RPS GENE FAMILY, PRIMERS, PROBES, AND DETECTION METHODS

Statement as to Federally Sponsored Research
This invention was made in part with Government
funding and the Government therefore has certain rights
in the invention.

Background of the Invention 
The invention relates to recombinant plant nucleic
acids and polypeptides and uses thereof to confer disease
resistance to pathogens in transgenic plants.

Plants employ a variety of defensive strategies to
combat pathogens. One defense response, the so-called
hypersensitive response (HR), involves rapid localized
necrosis of infected tissue. In several host-pathogen
interactions, genetic analysis has revealed a gene-for-gene
correspondence between a particular avirulence (avr)
gene in an avirulent pathogen and a particular resistance
gene in the host in which that pathogen elicits an HR.

Summary of the Invention

In general, the invention features substantially
pure DNA (for example, genomic DNA, cDNA, or synthetic
DNA) encoding an Rps polypeptide as defined below. In
related aspects, the invention also features a vector, a
cell (e.g., a plant cell), and a transgenic plant or seed
thereof which includes such a substantially pure DNA
encoding an Rps polypeptide.

In preferred embodiments, an RPS gene is the RPS2
gene of a plant of the genus Arabidopsis. In various
preferred embodiments, the cell is a transformed plant
cell derived from a cell of a transgenic plant. In
related aspects, the invention features a transgenic
plant containing a transgene which encodes an Rps
polypeptide that is expressed in plant tissue susceptible
to infection by pathogens expressing the avrRpt2
avirulence gene or pathogens expressing an avirulence
signal similarly recognized by an Rps polypeptide.

In a second aspect, the invention features a
substantially pure DNA which includes a promoter capable
of expressing the RPS2 gene in plant tissue susceptible
to infection by bacterial pathogens expressing the
avrRpt2 avirulence gene.

In preferred embodiments, the promoter is the
promoter native to an RPS gene. Additionally,
transcriptional and translational regulatory regions are
preferably native to an RPS gene.

The transgenic plants of the invention are
preferably plants which are susceptible to infection by a
pathogen expressing an avirulence gene, preferably the
avrRpt2 avirulence gene. In preferred embodiments the
transgenic plant is from the group of plants consisting
of but not limited to Arabidopsis, tomato, soybean, bean,
maize, wheat and rice.

In another aspect, the invention features a method
of providing resistance in a plant to a pathogen which
involves: (a) producing a transgenic plant cell having a
transgene encoding an Rps2 polypeptide wherein the
transgene is integrated into the genome of the transgenic
plant and is positioned for expression in the plant cell;
and (b) growing a transgenic plant from the transgenic
plant cell wherein the RPS2 transgene is expressed in the
transgenic plant.

In another aspect, the invention features a method
of detecting a resistance gene in a plant cell involving:
(a) contacting the RPS2 gene or a portion thereof greater
than 9 nucleic acids, preferably greater than 18 nucleic
acids in length with a preparation of genomic DNA from
the plant cell under hybridization conditions providing
detection of DNA sequences having about 50% or greater
sequence identity to the DNA sequence of Fig. 2 encoding
the Rps2 polypeptide.

In another aspect, the invention features a method
of producing an Rps2 polypeptide which involves: (a)
providing a cell transformed with DNA encoding an Rps2
polypeptide positioned for expression in the cell; (b)
culturing the transformed cell under conditions for
expressing the DNA; and (c) isolating the Rps2
polypeptide.

In another aspect, the invention features
substantially pure Rps2 polypeptide. Preferably, the
polypeptide includes a greater than 50 amino acid
sequence substantially identical to a greater than 50
amino acid sequence shown in Fig. 2, open reading frame
"a". Most preferably, the polypeptide is the Arabidopsis
thaliana Rps2 polypeptide.

In another aspect, the invention features a method
of providing resistance in a transgenic plant to
infection by pathogens which do not carry the avrRpt2
avirulence gene wherein the method includes: (a)
producing a transgenic plant cell having transgenes
encoding an Rps2 polypeptide as well as a transgene
encoding the avrRpt2 gene product wherein the transgenes
are integrated into the genome of the transgenic plant;
are positioned for expression in the plant cell; and the
avrRpt2 transgene and, if desired, the RPS2 gene, are
under the control of regulatory sequences suitable for
controlled expression of the gene(s); and (b) growing a
transgenic plant from the transgenic plant cell wherein
the RPS2 and avrRpt2 transgenes are expressed in the
transgenic plant.

In another aspect, the invention features a method
of providing resistance in a transgenic plant to
infection by pathogens in the absence of avirulence gene
expression in the pathogen wherein the method involves:
(a) producing a transgenic plant cell having integrated
in the genome a transgene containing the RPS2 gene under
the control of a promoter providing constitutive
expression of the RPS2 gene; and (b) growing a transgenic
plant from the transgenic plant cell wherein the RPS2
transgene is expressed constitutively in the transgenic
plant.

In another aspect, the invention features a method
of providing controllable resistance in a transgenic
plant to infection by pathogens in the absence of
avirulence gene expression in the pathogen wherein the
method involves: (a) producing a transgenic plant cell
having integrated in the genome a transgene containing
the RPS2 gene under the control of a promoter providing
controllable expression of the RPS2 gene; and (b) growing
a transgenic plant from the transgenic plant cell wherein
the RPS2 transgene is controllably expressed in the
transgenic plant. In preferred embodiments, the RPS2
gene is expressed using a tissue-specific or cell
type-specific promoter, or by a promoter that is activated
by the introduction of an external signal or agent, such
as a chemical signal or agent.

In other aspects, the invention features a
substantially pure oligonucleotide including one or a
combination of the sequences:

5' GGNATGGGNGGNNTNGGNAARACNAC 3', [SEQ ID NO: 158]
wherein N is A, T, G, or C; and R is A or G;

5' NARNGGNARNCC 3', [SEQ ID NO: 169] wherein N is
A, T, G or C; and R is A or G;

5' NCGNGWNGTNAKDAWNCGNGA 3', [SEQ ID NO: 159]
wherein N is A, T, G or C; W is A or T; D is A, G, or T;
and K is G or T;

5' GGWNTBGGWAARACHAC 3', [SEQ ID NO: 160] wherein
N is A, T, G or C; R is G or A; B is C, G, or T; H is A,
C, or T; and W is A or T;

5' TYGAYGAYRTBKRBRA 3', [SEQ ID NO: 163] wherein R
is G or A; B is C, G, or T; D is A, G, or T; Y is T or C;
and K is G or T;

5' TYCCAVAYRTCRTCNA 3', [SEQ ID NO: 164] wherein N 
is A, T, G or C; R is G or A; V is G or C or A; and Y is 
T or C; 

5' GGWYTBCCWYTBGCHYT 3', [SEQ ID NO: 170] wherein
B is C, G, or T; H is A, C, or T; W is A or T; and Y is T
or C;

5' ARDGCVARWGGVARNCC 3', [SEQ ID NO: 171] wherein
N is A, T, G or C; R is G or A; W is A or T; D is A, G,
or T; and V is G, C, or A; and

5' ARRTTRTCRTADSWRAWYTT 3', [SEQ ID NO: 174] 
wherein R is G or A; W is A or T; D is A, G, or T; S is 
G or C; and Y is C or T. 
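
As an illustration of how the degenerate base symbols in the
sequences above can be read, the following Python sketch
expands each symbol into the set of bases it stands for. The
symbol table follows the definitions recited above (with S, V,
and H as defined for the sequences in which they occur); the
function names degeneracy, to_regex, and matches are
hypothetical helpers introduced here for illustration and are
not part of the invention.

```python
import re
from math import prod

# Degenerate base symbols as defined in the text for the listed sequences.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "K": "GT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(oligo: str) -> int:
    """Number of distinct sequences represented by a degenerate oligo."""
    return prod(len(IUPAC[base]) for base in oligo.upper())

def to_regex(oligo: str) -> str:
    """Translate a degenerate oligo into a regular expression pattern."""
    parts = []
    for base in oligo.upper():
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else f"[{options}]")
    return "".join(parts)

def matches(oligo: str, sequence: str) -> bool:
    """True if a concrete sequence is one of the oligo's possible forms."""
    return re.fullmatch(to_regex(oligo), sequence.upper()) is not None
```

Under these assumptions, degeneracy("NARNGGNARNCC") counts the
distinct 12-mers covered by SEQ ID NO: 169, and matches() tests
whether a given concrete primer is among them.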

In other aspects, the invention features a
recombinant plant gene including one or a combination of
the DNA sequences:

5' GGNATGGGNGGNNTNGGNAARACNAC 3', [SEQ ID NO: 162]
wherein N is A, T, G or C; and R is A or G;

5' NARNGGNARNCC 3', [SEQ ID NO: 169] wherein N is
A, T, G or C; and R is A or G;

5' NCGNGWNGTNAKDAWNCGNGA 3', [SEQ ID NO: 167]
wherein N is A, T, G or C; W is A or T; D is A, G or T;
and K is G or T.

In another aspect, the invention features a
substantially pure plant polypeptide including one or a
combination of the amino acid sequences:

Gly Xaa1 Xaa2 Gly Xaa3 Gly Lys Thr Thr Xaa4 Xaa5,
[SEQ ID NO: 191] wherein Xaa1 is Met or Pro; Xaa2 is Gly
or Pro; Xaa3 is Ile, Leu, or Val; Xaa4 is Ile, Leu, or
Thr; and Xaa5 is Ala or Met;

Xaa1 Xaa2 Xaa3 Leu Xaa4 Xaa5 Xaa6 Asp Asp Xaa7
Xaa8, [SEQ ID NO: 192]
wherein Xaa1 is Phe or Lys; Xaa2 is Arg or Lys; Xaa3 is
Ile, Val, or Phe; Xaa4 is Ile, Leu, or Val; Xaa5 is Ile or
Leu; Xaa6 is Ile or Val; Xaa7 is Ile, Leu, or Val; and
Xaa8 is Asp or Trp;

Xaa1 Xaa2 Xaa3 Xaa4 Xaa5 Thr Xaa6 Arg, [SEQ ID NO: 193]
wherein Xaa1 is Ser or Cys; Xaa2 is Arg or Lys; Xaa3 is
Phe, Ile, or Val; Xaa4 is Ile or Met; Xaa5 is Ile, Leu,
or Phe; and Xaa6 is Ser, Cys, or Thr;

Gly Leu Pro Leu Xaa1 Xaa2 Xaa3 Xaa4, [SEQ ID NO: 194]
wherein Xaa1 is Thr, Ala, or Ser; Xaa2 is Leu or Val; Xaa3
is Ile, Val, or Lys; and Xaa4 is Val or Thr; and

Xaa1 Xaa2 Ser Tyr Xaa3 Xaa4 Leu, [SEQ ID NO: 195]
wherein Xaa1 is Lys or Gly; Xaa2 is Ile or Phe; Xaa3 is
Asp or Lys; and Xaa4 is Ala, Gly, or Asn.
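
For illustration only, a motif of this kind can be written as a
pattern and searched for in a protein sequence. The Python
sketch below encodes the motif recited for SEQ ID NO: 191 in
one-letter amino acid code, taking the three-letter
abbreviations above as the standard ones. The name find_motif
and the example sequences are hypothetical and are not taken
from the figures.

```python
import re

# Gly Xaa1 Xaa2 Gly Xaa3 Gly Lys Thr Thr Xaa4 Xaa5 (SEQ ID NO: 191), where
# Xaa1 = Met/Pro, Xaa2 = Gly/Pro, Xaa3 = Ile/Leu/Val,
# Xaa4 = Ile/Leu/Thr, Xaa5 = Ala/Met, written in one-letter code.
P_LOOP_MOTIF = re.compile(r"G[MP][GP]G[ILV]GKTT[ILT][AM]")

def find_motif(protein: str):
    """Return (start, matched_text) for each motif occurrence in a protein."""
    return [(m.start(), m.group()) for m in P_LOOP_MOTIF.finditer(protein.upper())]

if __name__ == "__main__":
    # Made-up test sequence, not a real Rps2 fragment.
    print(find_motif("MSTGMGGVGKTTLMQS"))
```

The other motifs could be encoded the same way, one character
class per variable position.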



In another aspect, the invention features a method
of isolating a disease-resistance gene or fragment
thereof from a plant cell, involving: (a) providing a
sample of plant cell DNA; (b) providing a pair of
oligonucleotides having sequence homology to a conserved
region of an RPS disease-resistance gene; (c) combining
the pair of oligonucleotides with the plant cell DNA
sample under conditions suitable for polymerase chain
reaction-mediated DNA amplification; and (d) isolating
the amplified disease-resistance gene or fragment
thereof.

In preferred embodiments, the amplification is
carried out using a reverse-transcription polymerase
chain reaction, for example, the RACE method.

In another aspect, the invention features a method
of identifying a plant disease-resistance gene in a plant
cell, involving: (a) providing a preparation of plant
cell DNA (for example, from the plant genome); (b)
providing a detectably-labelled DNA sequence (for
example, prepared by the methods of the invention) having
homology to a conserved region of an RPS gene; (c)
contacting the preparation of plant cell DNA with the
detectably-labelled DNA sequence under hybridization
conditions providing detection of genes having 50% or
greater sequence identity; and (d) identifying a
disease-resistance gene by its association with the
detectable label.

In another aspect, the invention features a method
of isolating a disease-resistance gene from a recombinant
plant cell library, involving: (a) providing a
recombinant plant cell library; (b) contacting the
recombinant plant cell library with a detectably-labelled
gene fragment produced according to the PCR method of the
invention under hybridization conditions providing
detection of genes having 50% or greater sequence
identity; and (c) isolating a member of a
disease-resistance gene by its association with the
detectable label.

In another aspect, the invention features a method
of isolating a disease-resistance gene from a recombinant
plant cell library, involving: (a) providing a
recombinant plant cell library; (b) contacting the
recombinant plant cell library with a detectably-labelled
RPS oligonucleotide of the invention under hybridization
conditions providing detection of genes having 50% or
greater sequence identity; and (c) isolating a
disease-resistance gene by its association with the
detectable label.

In another aspect, the invention features a
recombinant plant polypeptide capable of conferring
disease-resistance wherein the plant polypeptide includes
a P-loop domain or nucleotide binding site domain.
Preferably, the polypeptide further includes a
leucine-rich repeating domain.

In another aspect, the invention features a
recombinant plant polypeptide capable of conferring
disease-resistance wherein the plant polypeptide contains
a leucine-rich repeating domain.

In another aspect, the invention features a plant
disease-resistance gene isolated according to the method
involving: (a) providing a sample of plant cell DNA; (b)
providing a pair of oligonucleotides having sequence
homology to a conserved region of an RPS
disease-resistance gene; (c) combining the pair of
oligonucleotides with the plant cell DNA sample under
conditions suitable for polymerase chain
reaction-mediated DNA amplification; and (d) isolating the
amplified disease-resistance gene or fragment thereof.

In another aspect, the invention features a plant
disease-resistance gene isolated according to the method
involving: (a) providing a preparation of plant cell DNA;
(b) providing a detectably-labelled DNA sequence having
homology to a conserved region of an RPS gene; (c)
contacting the preparation of plant cell DNA with the
detectably-labelled DNA sequence under hybridization
conditions providing detection of genes having 50% or
greater sequence identity; and (d) identifying a
disease-resistance gene by its association with the
detectable label.

In another aspect, the invention features a plant
disease-resistance gene isolated according to the method
involving: (a) providing a recombinant plant cell
library; (b) contacting the recombinant plant cell
library with a detectably-labelled RPS gene fragment
produced according to the method of the invention under
hybridization conditions providing detection of genes
having 50% or greater sequence identity; and (c)
isolating a disease-resistance gene by its association
with the detectable label.

In another aspect, the invention features a method
of identifying a plant disease-resistance gene involving:
(a) providing a plant tissue sample; (b) introducing by
biolistic transformation into the plant tissue sample a
candidate plant disease-resistance gene; (c) expressing
the candidate plant disease-resistance gene within the
plant tissue sample; and (d) determining whether the
plant tissue sample exhibits a disease-resistance
response, whereby a response identifies a plant
disease-resistance gene.

Preferably, the plant tissue sample is either
leaf, root, flower, fruit, or stem tissue; the candidate
plant disease-resistance gene is obtained from a cDNA
expression library; and the disease-resistance response
is the hypersensitive response.

In another aspect, the invention features a plant
disease-resistance gene isolated according to the method
involving: (a) providing a plant tissue sample; (b)
introducing by biolistic transformation into the plant
tissue sample a candidate plant disease-resistance gene;
(c) expressing the candidate plant disease-resistance
gene within the plant tissue sample; and (d) determining
whether the plant tissue sample exhibits a
disease-resistance response, whereby a response identifies
a plant disease-resistance gene.

In another aspect, the invention features a
purified antibody which binds specifically to an rps
family protein. Such an antibody may be used in any
standard immunodetection method for the identification of
an RPS polypeptide.

In another aspect, the invention features a DNA
sequence substantially identical to the DNA sequence
shown in Figure 12.

In another aspect, the invention features a
substantially pure polypeptide having a sequence
substantially identical to a Prf amino acid sequence
shown in Figure 5 (A or B).

By "disease resistance gene" is meant a gene
encoding a polypeptide capable of triggering the plant
defense response in a plant cell or plant tissue. An RPS
gene is a disease resistance gene having about 50% or
greater sequence identity to the RPS2 sequence of Fig. 2
or a portion thereof. The gene, RPS2, is a disease
resistance gene encoding the Rps2 disease resistance
polypeptide from Arabidopsis thaliana.

By "polypeptide" is meant any chain of amino
acids, regardless of length or post-translational
modification (e.g., glycosylation or phosphorylation).

By "substantially identical" is meant a
polypeptide or nucleic acid exhibiting at least 50%,
preferably 85%, more preferably 90%, and most preferably
95% homology to a reference amino acid or nucleic acid
sequence. For polypeptides, the length of comparison
sequences will generally be at least 16 amino acids,
preferably at least 20 amino acids, more preferably at
least 25 amino acids, and most preferably 35 amino acids.
For nucleic acids, the length of comparison sequences
will generally be at least 50 nucleotides, preferably at
least 60 nucleotides, more preferably at least 75
nucleotides, and most preferably 110 nucleotides.
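
The thresholds in this definition can be illustrated with a
minimal Python sketch. It assumes the two sequences have
already been aligned to equal length (gaps written as "-") and
uses simple percent identity as a stand-in for the homology
score that sequence analysis software would report; that
substitution, and the names percent_identity, MIN_LENGTH, and
substantially_identical, are assumptions made here for
illustration.

```python
def percent_identity(ref: str, query: str) -> float:
    """Percent of aligned positions carrying the same residue.

    Gap characters ('-') never count as matches.
    """
    if len(ref) != len(query):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not ref:
        raise ValueError("empty alignment")
    same = sum(1 for a, b in zip(ref.upper(), query.upper()) if a == b and a != "-")
    return 100.0 * same / len(ref)

# Minimum comparison lengths given in the text ("generally at least").
MIN_LENGTH = {"polypeptide": 16, "nucleic acid": 50}

def substantially_identical(ref: str, query: str, kind: str,
                            threshold: float = 50.0) -> bool:
    """Apply the length floor and the percent threshold (50% by default)."""
    ungapped = len(ref.replace("-", ""))
    return ungapped >= MIN_LENGTH[kind] and percent_identity(ref, query) >= threshold
```

Passing threshold=85.0, 90.0, or 95.0 corresponds to the
preferred, more preferred, and most preferred levels recited
above.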

Sequence identity is typically measured using
sequence analysis software (e.g., Sequence Analysis
Software Package of the Genetics Computer Group,
University of Wisconsin Biotechnology Center, 1710
University Avenue, Madison, WI 53705). Such software
matches similar sequences by assigning degrees of
homology to various substitutions, deletions, and other
modifications. Conservative
substitutions typically include substitutions within the
following groups: glycine, alanine; valine, isoleucine,
leucine; aspartic acid, glutamic acid, asparagine,
glutamine; serine, threonine; lysine, arginine; and
phenylalanine, tyrosine.
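
As a small illustration, the conservative substitution groups
listed above can be written as sets in one-letter amino acid
code and used to classify a residue change. The name
is_conservative is a hypothetical helper, and the sketch makes
no claim about how any particular software package scores such
changes.

```python
# Groups as listed in the text, in one-letter code.
CONSERVATIVE_GROUPS = [
    set("GA"),    # glycine, alanine
    set("VIL"),   # valine, isoleucine, leucine
    set("DENQ"),  # aspartic acid, glutamic acid, asparagine, glutamine
    set("ST"),    # serine, threonine
    set("KR"),    # lysine, arginine
    set("FY"),    # phenylalanine, tyrosine
]

def is_conservative(a: str, b: str) -> bool:
    """True when two different residues fall within the same listed group."""
    a, b = a.upper(), b.upper()
    return a != b and any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```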

By a "substantially pure polypeptide" is meant an
Rps2 polypeptide which has been separated from components
which naturally accompany it. Typically, the polypeptide
is substantially pure when it is at least 60%, by weight,
free from the proteins and naturally-occurring organic
molecules with which it is naturally associated.
Preferably, the preparation is at least 75%, more
preferably at least 90%, and most preferably at least
99%, by weight, Rps2 polypeptide. A substantially pure
Rps2 polypeptide may be obtained, for example, by
extraction from a natural source (e.g., a plant cell); by
expression of a recombinant nucleic acid encoding an Rps2
polypeptide; or by chemically synthesizing the protein.
Purity can be measured by any appropriate method, e.g.,
column chromatography, polyacrylamide gel
electrophoresis, or HPLC analysis.

A protein is substantially free of naturally
associated components when it is separated from those
contaminants which accompany it in its natural state.
Thus, a protein which is chemically synthesized or
produced in a cellular system different from the cell
from which it naturally originates will be substantially
free from its naturally associated components.
Accordingly, substantially pure polypeptides include
those derived from eukaryotic organisms but synthesized
in E. coli or other prokaryotes.

By "substantially pure DNA" is meant DNA that is
free of the genes which, in the naturally-occurring
genome of the organism from which the DNA of the
invention is derived, flank the gene. The term therefore
includes, for example, a recombinant DNA which is
incorporated into a vector; into an autonomously
replicating plasmid or virus; or into the genomic DNA of
a prokaryote or eukaryote; or which exists as a separate
molecule (e.g., a cDNA or a genomic or cDNA fragment
produced by PCR or restriction endonuclease digestion)
independent of other sequences. It also includes a
recombinant DNA which is part of a hybrid gene encoding
additional polypeptide sequence.

By "transformed cell" is meant a cell into which 
(or into an ancestor of which) has been introduced, by 
means of recombinant DNA techniques, a DNA molecule 
encoding (as used herein) an Rps2 polypeptide. 

By "positioned for expression" is meant that the
DNA molecule is positioned adjacent to a DNA sequence
which directs transcription and translation of the
sequence (i.e., facilitates the production of, e.g., an
Rps2 polypeptide, a recombinant protein or a RNA
molecule).

By "reporter gene" is meant a gene whose
expression may be assayed; such genes include, without
limitation, β-glucuronidase (GUS), luciferase,
chloramphenicol transacetylase (CAT), and
β-galactosidase.

By "promoter" is meant minimal sequence sufficient
to direct transcription. Also included in the invention
are those promoter elements which are sufficient to
render promoter-dependent gene expression controllable
for cell-type specific, tissue-specific or inducible by
external signals or agents; such elements may be located
in the 5' or 3' regions of the native gene.

By "operably linked" is meant that a gene and a
regulatory sequence(s) are connected in such a way as to
permit gene expression when the appropriate molecules
(e.g., transcriptional activator proteins) are bound to
the regulatory sequence(s).

By "plant cell" is meant any self-propagating cell
bounded by a semi-permeable membrane and containing a
plastid. Such a cell also requires a cell wall if further
propagation is desired. Plant cell, as used herein,
includes, without limitation, algae, cyanobacteria, seeds,
suspension cultures, embryos, meristematic regions,
callus tissue, leaves, roots, shoots, gametophytes,
sporophytes, pollen, and microspores.

By "transgene" is meant any piece of DNA which is
inserted by artifice into a cell, and becomes part of the
genome of the organism which develops from that cell.
Such a transgene may include a gene which is partly or
entirely heterologous (i.e., foreign) to the transgenic
organism, or may represent a gene homologous to an
endogenous gene of the organism.

By "transgenic" is meant any cell which includes a
DNA sequence which is inserted by artifice into a cell
and becomes part of the genome of the organism which
develops from that cell. As used herein, the transgenic
organisms are generally transgenic plants and the DNA
(transgene) is inserted by artifice into the nuclear or
plastidic genome.

By "pathogen" is meant an organism whose infection
into the cells of viable plant tissue elicits a disease
response in the plant tissue.

By an "RPS disease-resistance gene" is meant any
member of the family of plant genes characterized by
their ability to trigger a plant defense response and
having at least 20%, preferably 30%, and most preferably
50% amino acid sequence identity to one of the conserved
regions of one of the RPS members described herein (i.e.,
either the RPS2, L6, N, or Prf genes). Representative
members of the RPS gene family include, without
limitation, the rps2 gene of Arabidopsis, the L6 gene of
flax, the Prf gene of tomato, and the N gene of tobacco.

By "conserved region" is meant any stretch of six
or more contiguous amino acids exhibiting at least 30%,
preferably 50%, and most preferably 70% amino acid
sequence identity between two or more of the RPS family
members, RPS2, L6, N, or Prf. Examples of preferred
conserved regions are shown (as boxed or designated
sequences) in Figures 5 A and B, 6, 7, and 8 and include,
without limitation, nucleotide binding site domains,
leucine-rich repeats, leucine zipper domains, and P-loop
domains.
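
One way to picture this definition is a sliding-window scan
over two aligned protein sequences. The Python sketch below
reports every window of six aligned residues whose identity
meets a chosen cutoff. It assumes a pairwise alignment of equal
length is already in hand; the function name conserved_windows
and the fixed-window approach are illustrative assumptions, not
the procedure used to prepare the figures.

```python
def conserved_windows(seq_a: str, seq_b: str, window: int = 6,
                      min_identity: float = 30.0):
    """Return (start, identity_percent) for each window meeting the cutoff.

    seq_a and seq_b must be pre-aligned to the same length; gap
    characters ('-') never count as matches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    hits = []
    for start in range(len(seq_a) - window + 1):
        a = seq_a[start:start + window].upper()
        b = seq_b[start:start + window].upper()
        same = sum(1 for x, y in zip(a, b) if x == y and x != "-")
        identity = 100.0 * same / window
        if identity >= min_identity:
            hits.append((start, identity))
    return hits
```

Raising min_identity to 50.0 or 70.0 corresponds to the
preferred and most preferred levels stated above; overlapping
hits could be merged into longer stretches if desired.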

By "detectably-labelled" is meant any means for
marking and identifying the presence of a molecule, e.g.,
an oligonucleotide probe or primer, a gene or fragment
thereof, or a cDNA molecule. Methods for
detectably-labelling a molecule are well known in the art
and include, without limitation, radioactive labelling
(e.g., with an isotope such as 32P or 35S) and
nonradioactive labelling (e.g., chemiluminescent
labelling, e.g., fluorescein labelling).

By "biolistic transformation" is meant any method
for introducing foreign molecules into a cell using
velocity driven microprojectiles such as tungsten or gold
particles. Such velocity-driven methods originate from
pressure bursts which include, but are not limited to,
helium-driven, air-driven, and gunpowder-driven
techniques. Biolistic transformation may be applied to
the transformation or transfection of a wide variety of
cell types and intact tissues including, without
limitation, intracellular organelles (e.g., chloroplasts
and mitochondria), bacteria, yeast, fungi, algae, pollen,
animal tissue, plant tissue (e.g., leaf, seedling,
embryo, epidermis, flower, meristem, and root), pollen,
and cultured cells.

By "purified antibody" is meant antibody which is
at least 60%, by weight, free from proteins and
naturally-occurring organic molecules with which it is
naturally associated. Preferably, the preparation is at
least 75%, more preferably 90%, and most preferably at
least 99%, by weight, antibody, e.g., an rps2-specific
antibody. A purified rps antibody may be obtained, for
example, by affinity chromatography using
recombinantly-produced rps protein or conserved motif
peptides and standard techniques.

By "specifically binds" is meant an antibody which
recognizes and binds an rps protein but which does not
substantially recognize and bind other molecules in a
sample, e.g., a biological sample, which naturally
includes rps protein.

Other features and advantages of the invention 
will be apparent from the following description of the 
preferred embodiments thereof, and from the claims. 

Detailed Description 
The drawings will first be described.

Drawings 

Figs. 1A - 1F are a schematic summary of the
physical and RFLP analysis that led to the cloning of the
RPS2 locus.

Fig. 1A is a diagram showing the alignment of the
genetic and the RFLP maps of the relevant portion of
Arabidopsis thaliana chromosome IV adapted from the map
published by Lister and Dean (1993) Plant J. 4:745-750.
The RFLP marker L11F11 represents the left arm of the
YUP11F11 YAC clone.

Fig. 1B is a diagram showing the alignment of
relevant YACs around the RPS2 locus. YAC constructs
designated YUP16G5, YUP18G9 and YUP11F11 were provided by
J. Ecker, University of Pennsylvania. YAC constructs
designated EW3H7, EW11D4, EW11E4, and EW9C3 were provided
by E. Ward, Ciba-Geigy, Inc.

Fig. 1C is a diagram showing the alignment of
cosmid clones around the RPS2 locus. Cosmid clones with
the designation H are derivatives of the EW3H7 YAC clone
whereas those with the designation E are derivatives of
the EW11E4 YAC clone. Vertical arrows indicate the
relative positions of RFLP markers between the ecotypes
La-er and the rps2-101N plant. The RFLP markers were
identified by screening a Southern blot containing more
than 50 different restriction enzyme digests using either
the entire part or pieces of the corresponding cosmid
clones as probes. The cosmid clones described in Fig. 1C
were provided by J. Giraudat, C.N.R.S., Gif-sur-Yvette,
France.

Figs. 1D and 1E are maps of EcoRI restriction
endonuclease sites in the cosmids E4-4 and E4-6,
respectively. The recombination break points surrounding
the RPS2 locus are located within the 4.5 and 7.5 kb
EcoRI restriction endonuclease fragments.

Fig. 1F is a diagram showing the approximate
location of genes which encode the RNA transcripts which
have been identified by polyA+ RNA blot analysis. The
sizes of the transcripts are given in kilobase pairs
below each transcript.

Fig. 2 [SEQ ID NOS: 1-104, 196-201] is the
complete nucleotide sequence of cDNA-4 comprising the
RPS2 gene locus. The three reading frames are shown
below the nucleotide sequence. The deduced amino acid
sequence of reading frame "a" is provided and contains
909 amino acids. The methionine encoded by the ATG start
codon is circled in open reading frame "a" of Fig. 2.
The A of the ATG start codon is nucleotide 31 of Fig. 2.

Fig. 3 [SEQ ID NOS: 105-106] is the nucleotide
sequence of the avrRpt2 gene and its deduced amino acid
sequence. A potential ribosome binding site is
underlined. An inverted repeat is indicated by
horizontal arrows at the 3' end of the open reading
frame. The deduced amino acid sequence is provided below
the nucleotide sequence of the open reading frame.

Fig. 4 is a schematic summary of the
complementation analysis that allowed functional
confirmation that the DNA carried on p4104 and p4115
(encoding cDNA-4) confers RPS2 disease resistance
activity to Arabidopsis thaliana plants previously
lacking RPS2 disease resistance activity. Small vertical
marks along the "genome" line represent restriction
enzyme EcoRI recognition sites, and the numbers above
this line represent the size, in kilobase pairs (kb), of
the resulting DNA fragments (see also Fig. 1E). Opposite
"cDNAs" are the approximate locations of the coding
sequences for RNA transcripts (see also Fig. 1F);
arrowheads indicate the direction of transcription for
cDNAs 4, 5, and 6. For functional complementation
experiments, rps2-201C/rps2-201C plants were genetically
transformed with the Arabidopsis thaliana genomic DNA
sequences indicated; these sequences were carried on the
named plasmids (derivatives of the binary cosmid vector
pSLJ4541) and delivered to the plant via
Agrobacterium-mediated transformation methods. The disease
resistance phenotype of the resulting transformants
following inoculation with P. syringae expressing avrRpt2
is given as "Sus." (susceptible, no resistance response)
or "Res." (disease resistant).

Fig. 5A [SEQ ID NOS: 107-136 and 142] shows
regions of sequence similarity between the L-6 protein of
flax, N protein of tobacco, Prf protein of tomato, and
rps2 protein of Arabidopsis.

Fig. 5B [SEQ ID NOS: 107, 108, 137-140] shows
sequence similarity between the N and L-6 proteins.

Fig. 6 [SEQ ID NOS: 141 and 142] shows a sequence
analysis of RPS2 polypeptide showing polypeptide regions
corresponding to an N-terminal hydrophobic region, a
leucine zipper, NBSs (kinase-1a, kinase-2, and kinase-3
motifs), and a predicted membrane integrated region.

Fig. 7 [SEQ ID NOS: 143-146] shows the amino acid
sequence of the RPS2 LRR (amino acids 505-867). The top
line indicates the consensus sequences for the RPS2 LRR.
An "X" stands for an arbitrary amino acid sequence and an
"a" stands for an aliphatic amino acid residue. The
consensus sequence for the RPS2 LRR is closely related to
the consensus for the yeast adenylate cyclase CYR1 LRR
(PXXaXXLXXLXXLXLXXNXaXXa). The amino acid residues
that match the consensus sequence are shown in bold.
Although this figure shows 14 LRRs, the C-terminal
boundary of the LRR is not very clear because the LRR
closer to the C-terminus does not fit the consensus
sequence very well.
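
The consensus notation described for Fig. 7 can be read as a simple
pattern. The following is a minimal sketch in Python, not part of the
original disclosure, that turns the CYR1-type consensus into a regular
expression and counts how many constrained positions a candidate
repeat matches. The set of residues treated as "aliphatic" (A, V, L,
I, M) and the two test segments are assumptions chosen only for
illustration; they are not RPS2 sequences.

```python
import re

# Assumption: "a" (aliphatic) is taken here to mean A, V, L, I or M.
ALIPHATIC = "AVLIM"
# Consensus as given for the yeast adenylate cyclase CYR1 LRR.
CONSENSUS = "PXXaXXLXXLXXLXLXXNXaXXa"


def consensus_to_regex(consensus, aliphatic=ALIPHATIC):
    """Translate the consensus notation into a compiled regular expression."""
    parts = []
    for symbol in consensus:
        if symbol == "X":          # any residue
            parts.append(".")
        elif symbol == "a":        # any aliphatic residue
            parts.append("[" + aliphatic + "]")
        else:                      # a fixed residue such as P, L or N
            parts.append(symbol)
    return re.compile("".join(parts))


def count_matching_positions(consensus, segment, aliphatic=ALIPHATIC):
    """Return (matched, constrained) over the non-X consensus positions."""
    matched = 0
    constrained = 0
    for symbol, residue in zip(consensus, segment):
        if symbol == "X":
            continue
        constrained += 1
        if symbol == "a":
            matched += int(residue in aliphatic)
        else:
            matched += int(residue == symbol)
    return matched, constrained


# Hypothetical 23-residue segments, used only to exercise the functions.
PERFECT_REPEAT = "PKSLGELSNLKTLDLSGNKLSGI"
DEGENERATE_REPEAT = "PKSLGELSNLKTLDLSGSKLSGI"  # N replaced by S
```

A repeat that matches only some constrained positions, like the
degenerate example, illustrates why the C-terminal boundary of the LRR
is hard to call.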

Fig. 8 [SEQ ID NO: 3] shows a sequence analysis of
RPS2, indicating regions with similarity to leucine
zipper, P-loop, membrane-spanning, and leucine-rich
repeat motifs. Regions with similarity to defined
functional domains are indicated with a line over the
relevant amino acids. Potential N-glycosylation
sequences are marked with a dot, and the location of the
rps2-201 Thr to Pro mutation at amino acid 668 is marked
with an asterisk.

Fig. 9 is a schematic representation of the
transient assay method. The top panel shows the
essential principles of the assay. The bottom panel
shows a schematic representation of the actual transient
assay procedure. Psp NPS3121 is used because it is a
weak Arabidopsis pathogen, but potent in causing the HR
when carrying an avirulence gene. In the absence of an
HR, the damage to plant cells infected with NPS3121 is
minimal, enhancing the difference of GUS accumulation in
cells that undergo the HR in comparison to those that do
not. Prior to bombardment, one half of an Arabidopsis
leaf is infiltrated with P. syringae (stippled side of
leaf); the other half of the leaf serves as a noninfected
control, an "internal" reference for the infected side,
and as a measure of transformation efficiency.

Fig. 10, panels A-B, are photographs showing the
complementation of the rps2 mutant phenotype using the
biolistic transient expression assay. The left sides of
rps2-101C mutant leaves were infiltrated with Psp
3121/avrRpt2. Infiltrated leaves were cobombarded with
either 35S-uidA plus ΔGUS (Panel A) or 35S-uidA plus
35S-RPS2 (cDNA-2 clone 4) (Panel B). Note that in Panel B
the infected side of the leaf shows less GUS activity
than the uninfected side, indicating that the transformed
cells on the infected side underwent an HR and that
35S-RPS2 complemented the mutant phenotype (see Fig. 9).

Fig. 11 is a schematic representation of pKEx4tr
showing the structure of this cDNA expression vector.
For convenience, the multiple cloning site contains the
8 bp recognition sequences for PmeI and NotI and is
flanked by T7 and T3 promoters. The region spanning the
modified 35S promoter to the nopaline synthase 3'
sequences (nos 3') was cloned into the HindIII-EcoRI
site of pUC18, resulting in the loss of the EcoRI site.

Fig. 12 shows a nucleic acid sequence of the
tomato Prf gene.

The Genetic Basis for Resistance to Pathogens

An overview of the interaction between a plant
host and a microbial pathogen is presented. The invasion
of a plant by a potential pathogen can have a range of
outcomes delineated by two extremes: either the
pathogen successfully proliferates in the host, causing
associated disease symptoms, or its growth is halted by
the host defenses. In some plant-pathogen interactions,
the visible hallmark of an active defense response is the
so-called hypersensitive response or "HR". The HR
involves rapid necrosis of cells near the site of the
infection and may include the formation of a visible dry
brown lesion. Pathogens which elicit an HR on a given
host are said to be avirulent on that host; the host is
said to be resistant, and the plant-pathogen interaction
is said to be incompatible. Strains which proliferate
and cause disease on a particular host are said to be
virulent; in this case the host is said to be
susceptible, and the plant-pathogen interaction is said
to be compatible.

"Classical" genetic analysis has been used 
successfully to help elucidate the genetic basis of
plant-pathogen recognition for those cases in which a
series of strains (races) of a particular fungal or
bacterial pathogen are either virulent or avirulent on a
series of cultivars (or different wild accessions) of a
particular host species. In many such cases, genetic
analysis of both the host and the pathogen revealed that
many avirulent fungal and bacterial strains differ from
virulent ones by the possession of one or more avirulence
(avr) genes that have corresponding "resistance" genes in
the host. This avirulence gene-resistance gene
correspondence is termed the "gene-for-gene" model
(Crute, et al., (1985) pp 197-309 in: Mechanisms of
Resistance to Plant Disease. R.S.S. Fraser, ed.;
Ellingboe, (1981) Annu. Rev. Phytopathol. 19:125-143;
Flor, (1971) Annu. Rev. Phytopathol. 9:275-296; Keen and
Staskawicz, (1988) supra; and Keen et al. in: Application
of Biotechnology to Plant Pathogen Control. I. Chet, ed.,
John Wiley & Sons, 1993, pp. 65-88). According to a
simple formulation of this model, plant resistance genes
encode specific receptors for molecular signals generated
by avr genes. Signal transduction pathway(s) then carry
the signal to a set of target genes that initiate the HR
and other host defenses (Gabriel and Rolfe, (1990) Annu.
Rev. Phytopathol. 28:365-391). Despite this simple
predictive model, the molecular basis of the
avr-resistance gene interaction is still unknown.
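
The gene-for-gene rule described above can be restated as a small
lookup. The following Python sketch is illustrative only and is not
part of the original disclosure: the function and table names are
hypothetical, and the table lists only the two avr/resistance gene
pairs named in this specification (avrRpt2 with RPS2, and avrPto with
Pto). It deliberately ignores signal transduction and dosage effects.

```python
# Illustrative pairing of avirulence genes with the resistance genes
# that recognize them. Only pairs named in this document are listed.
RECOGNITION = {
    "avrRpt2": "RPS2",
    "avrPto": "Pto",
}


def interaction(pathogen_avr_genes, host_resistance_genes):
    """Classify a plant-pathogen interaction under the gene-for-gene model.

    The interaction is "incompatible" (host resistant, HR expected) when
    at least one avr gene carried by the pathogen has its corresponding
    resistance gene in the host; otherwise it is "compatible".
    """
    for avr in pathogen_avr_genes:
        resistance_gene = RECOGNITION.get(avr)
        if resistance_gene is not None and resistance_gene in host_resistance_genes:
            return "incompatible"
    return "compatible"
```

Under this sketch, a strain carrying avrRpt2 is avirulent on a host
carrying RPS2 and virulent on a host lacking functional RPS2, which is
the behavior the rps2 mutants described below display.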

One basic prediction of the gene-for-gene
hypothesis has been convincingly confirmed at the
molecular level by the cloning of a variety of bacterial
avr genes (Innes, et al., (1993) J. Bacteriol.
175:4859-4869; Dong, et al., (1991) Plant Cell 3:61-72;
Whalen et al., (1991) Plant Cell 3:49-59; Staskawicz et
al., (1987) J. Bacteriol. 169:5789-5794; Gabriel et al.,
(1986) P.N.A.S., USA 83:6415-6419; Keen and Staskawicz,
(1988) Annu. Rev. Microbiol. 42:421-440; Kobayashi et al.,
(1990) Mol. Plant-Microbe Interact. 3:94-102 and (1990)
Mol. Plant-Microbe Interact. 3:103-111). Many of these
cloned avirulence genes have been shown to correspond to
individual resistance genes in the cognate host plants
and have been shown to confer an avirulent phenotype when
transferred to an otherwise virulent strain. The avrRpt2
locus was isolated from Pseudomonas syringae pv. tomato
and sequenced by Innes et al. (Innes, R. et al. (1993) J.
Bacteriol. 175:4859-4869). Fig. 3 is the nucleotide
sequence and deduced amino acid sequence of the avrRpt2
gene.

Examples of known signals to which plants respond
when infected by pathogens include harpins from Erwinia
(Wei et al. (1992) Science 257:85-88) and Pseudomonas (He
et al. (1993) Cell 73:1255-1266); avr4 (Joosten et al.
(1994) Nature 367:384-386) and avr9 peptides (van den
Ackerveken et al. (1992) Plant J. 2:359-366) from
Cladosporium; PopA1 from Pseudomonas (Arlat et al. (1994)
EMBO J. 13:543-553); avrD-generated lipopolysaccharide
(Midland et al. (1993) J. Org. Chem. 58:2940-2945); and
NIP1 from Rhynchosporium (Hahn et al. (1993) Mol.
Plant-Microbe Interact. 6:745-754).

Compared to avr genes, considerably less is known
about plant resistance genes that correspond to specific
avr-generated signals. The plant resistance gene, RPS2
(rps for resistance to Pseudomonas syringae), the first
gene of a new, previously unidentified class of plant
disease resistance genes, corresponds to a specific avr
gene (avrRpt2). Some of the work leading up to the
cloning of RPS2 is described in Yu, et al., (1993),
Molecular Plant-Microbe Interactions 6:434-443 and in
Kunkel, et al., (1993) Plant Cell 5:865-875.

A plant disease resistance gene, Pto, which
corresponds specifically to an apparently unrelated
avirulence gene, has been isolated from tomato (Lycopersicon
esculentum) (Martin et al., (1993) Science 262:1432-1436).
Tomato plants expressing the Pto gene are resistant to
infection by strains of Pseudomonas syringae pv. tomato
that express the avrPto avirulence gene. The amino acid
sequence inferred from the Pto gene DNA sequence displays
strong similarity to serine-threonine protein kinases,
implicating Pto in signal transduction. No similarity to
the tomato Pto locus or any known protein kinases was
observed for RPS2, suggesting that RPS2 is representative
of a new class of plant disease resistance genes.

The isolation of a race-specific resistance gene
from Zea mays (corn) known as Hm1 has been reported
(Johal and Briggs (1992) Science 258:985-987). Hm1
confers resistance against specific races of the fungal
pathogen Cochliobolus carbonum by controlling degradation
of a fungal toxin, a strategy that is mechanistically
distinct from the avirulence gene-specific resistance of
the RPS2-avrRpt2 mechanism.

The cloned RPS2 gene of the invention can be used
to facilitate the construction of plants that are
resistant to specific pathogens and to overcome the
inability to transfer disease resistance genes between
species using classical breeding techniques (Keen et al.,
(1993), supra). There now follows a description of the
cloning and characterization of an Arabidopsis thaliana
RPS2 genetic locus, the RPS2 genomic DNA, and the RPS2
cDNA. The avrRpt2 gene and the RPS2 gene, as well as
mutants rps2-101C, rps2-102C, and rps2-201C (also
designated rps2-201), are described in Dong, et al.,
(1991) Plant Cell 3:61-72; Yu, et al., (1993) supra;
Kunkel et al., (1993) supra; Whalen et al., (1991),
supra; and Innes et al., (1993), supra. A mutant
designated rps2-101N has also been isolated. The
identification and cloning of the RPS2 gene is described
below.

RPS2 Overcomes Sensitivity to Pathogens Carrying the
avrRpt2 Gene

To demonstrate the genetic relationship between an
avirulence gene in the pathogen and a resistance gene in
the host, it was necessary first to isolate an avirulence
gene. By screening Pseudomonas strains that are known
pathogens of crop plants related to Arabidopsis, two highly
virulent strains, P. syringae pv. maculicola (Psm)
ES4326 and P. syringae pv. tomato (Pst) DC3000, and an
avirulent strain, Pst MM1065, were identified and analyzed
as to their respective abilities to grow in wild type
Arabidopsis thaliana plants (Dong et al., (1991) Plant
Cell, 3:61-72; Whalen et al., (1991) Plant Cell 3:49-59;
MM1065 is designated JL1065 in Whalen et al.). Psm
ES4326 or Pst DC3000 can multiply 10^4-fold in Arabidopsis
thaliana leaves and cause water-soaked lesions that
appear over the course of two days. Pst MM1065
multiplies a maximum of 10-fold in Arabidopsis thaliana
leaves and causes the appearance of a mildly chlorotic
dry lesion after 48 hours. Thus, disease resistance is
associated with severely inhibited growth of the
pathogen.

An avirulence gene (avr) of the Pst MM1065 strain
was cloned using standard techniques as described in Dong
et al. (1991), Plant Cell 3:61-72; Whalen et al., (1991)
supra; and Innes et al., (1993), supra. The isolated
avirulence gene from this strain was designated avrRpt2.
Normally, the virulent strain Psm ES4326 or Pst DC3000
causes the appearance of disease symptoms after 48 hours
as described above. In contrast, Psm ES4326/avrRpt2 or
Pst DC3000/avrRpt2 elicits the appearance of a visible
necrotic hypersensitive response (HR) within 16 hours
and multiplies 50-fold less than Psm ES4326 or Pst DC3000
in wild type Arabidopsis thaliana leaves (Dong et al.,
(1991), supra; and Whalen et al., (1991), supra). Thus,
disease resistance in a wild type Arabidopsis plant
requires, in part, an avirulence gene in the pathogen or
a signal generated by the avirulence gene.
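
The growth figures quoted above can be put on a common log10 scale.
The short Python sketch below is offered only as a worked example of
that arithmetic; the constant and function names are hypothetical, and
the fold values are the approximate figures stated in the text rather
than measured data.

```python
import math

# Approximate fold multiplication in wild-type leaves, as stated above.
VIRULENT_FOLD = 1e4            # Psm ES4326 or Pst DC3000
AVIRULENT_MM1065_FOLD = 10.0   # Pst MM1065 (maximum)
AVRRPT2_REDUCTION = 50.0       # strains carrying avrRpt2 multiply 50-fold less


def fold_with_avrRpt2(virulent_fold=VIRULENT_FOLD, reduction=AVRRPT2_REDUCTION):
    """Fold multiplication implied for a virulent strain carrying avrRpt2."""
    return virulent_fold / reduction


def log10_growth(fold):
    """Express a fold increase as log10 units of growth."""
    return math.log10(fold)
```

On these figures a virulent strain carrying avrRpt2 multiplies about
200-fold, roughly 2.3 log10 units against 4 log10 units for the same
strain without avrRpt2.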

The isolation of four Arabidopsis thaliana disease
resistance mutants has been described using the cloned
avrRpt2 gene to search for the host gene required for
disease resistance to pathogens carrying the avrRpt2 gene
(Yu et al., (1993), supra; Kunkel et al., (1993), supra).
The four Arabidopsis thaliana mutants failed to develop
an HR when infiltrated with Psm ES4326/avrRpt2 or Pst
DC3000/avrRpt2, as expected for plants having lost their
disease resistance capacity. In the case of one of these
mutants, approximately 3000 five to six week old M2
ecotype Columbia (Col-0) plants generated by ethyl
methanesulfonic acid (EMS) mutagenesis were
hand-inoculated with Psm ES4326/avrRpt2 and a single mutant,
rps2-101C, was identified (Resistance to Pseudomonas
syringae) (Yu et al., (1993), supra).

The second mutant was isolated using a procedure
that specifically enriches for mutants unable to mount an
HR (Yu et al., (1993), supra). When 10-day old
Arabidopsis thaliana seedlings growing on petri plates
are infiltrated with Pseudomonas syringae pv.
phaseolicola (Psp) NPS3121 versus Psp NPS3121/avrRpt2,
about 90% of the plants infiltrated with Psp NPS3121
survive, whereas about 90%-95% of the plants infiltrated
with Psp NPS3121/avrRpt2 die. Apparently, vacuum
infiltration of an entire small Arabidopsis thaliana
seedling with Psp NPS3121/avrRpt2 elicits a systemic HR
which usually kills the seedling. In contrast, seedlings
infiltrated with Psp NPS3121 survive because Psp NPS3121
is a weak pathogen on Arabidopsis thaliana. The second
disease resistance mutant was isolated by infiltrating
4000 EMS-mutagenized Columbia M2 seedlings with Psp
NPS3121/avrRpt2. Two hundred survivors were obtained.
These were transplanted to soil and re-screened by hand
inoculation when the plants reached maturity. Of these
200 survivors, one plant failed to give an HR when
hand-infiltrated with Psm ES4326/avrRpt2. This mutant was
designated rps2-102C (Yu et al., (1993), supra).

A third mutant, rps2-201C, was isolated in a
screen of approximately 7500 M2 plants derived from seed
of Arabidopsis thaliana ecotype Col-0 that had been
mutagenized with diepoxybutane (Kunkel et al., (1993),
supra). Plants were inoculated by dipping entire leaf
rosettes into a solution containing Pst DC3000/avrRpt2
bacteria and the surfactant Silwet L-77 (Whalen et al.,
(1991), supra); the plants were then incubated in a
controlled environment growth chamber for three to four
days, and disease symptom development was observed visually.
This screen revealed four mutant lines (carrying the
rps2-201C, rps2-202C, rps2-203C, and rps2-204C alleles),
and plants homozygous for rps2-201C were a primary
subject for further study (Kunkel et al., (1993), supra
and the instant application).

Isolation of the fourth rps2 mutant, rps2-101N,
has not yet been published. This fourth isolate is
either a mutant or a susceptible Arabidopsis ecotype.
Seeds of the Arabidopsis Nossen ecotype were
gamma-irradiated and then sown densely in flats and allowed to
germinate and grow through a nylon mesh. When the plants
were five to six weeks old, the flats were inverted, the
plants were partially submerged in a tray containing a
culture of Psm ES4326/avrRpt2, and the plants were vacuum
infiltrated in a vacuum desiccator. Plants inoculated
this way develop an HR within 24 hours. Using this
procedure, approximately 40,000 plants were screened and
one susceptible plant was identified. Subsequent RFLP
analysis of this plant suggested that it may not be a
Nossen mutant but rather a different Arabidopsis ecotype
that is susceptible to Psm ES4326/avrRpt2. This plant is
referred to as rps2-101N. The isolated mutants rps2-101C,
rps2-102C, rps2-201C, and rps2-101N are referred to
collectively as the "rps2 mutants".
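
The four screens described above differ greatly in scale. As a rough
illustration only, the Python sketch below tabulates the approximate
plant counts quoted in the text and computes the recovery frequency of
each screen; the variable and function names are hypothetical, and the
counts are the approximate figures given above, not exact tallies.

```python
# (alleles recovered, approximate plants screened, mutant lines recovered)
SCREENS = [
    ("rps2-101C", 3000, 1),                 # hand inoculation of EMS M2 plants
    ("rps2-102C", 4000, 1),                 # seedling survival enrichment
    ("rps2-201C to rps2-204C", 7500, 4),    # rosette dipping, diepoxybutane M2
    ("rps2-101N", 40000, 1),                # vacuum infiltration of flats
]

# In the enrichment screen, 200 of 4000 seedlings survived the first round.
ENRICHMENT_SURVIVAL = 200 / 4000


def recovery_frequency(screened, recovered):
    """Fraction of screened plants that yielded a mutant line."""
    return recovered / screened


FREQUENCIES = {name: recovery_frequency(n, k) for name, n, k in SCREENS}
```

The enrichment step leaves about 5% of seedlings to be re-screened by
hand, which is the practical benefit of that procedure.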

The rps2 Mutants Fail to Specifically Respond to the
Cloned Avirulence Gene, avrRpt2
The RPS2 gene product is specifically required for
resistance to pathogens carrying the avirulence gene,
avrRpt2. A mutation in Rps2 polypeptide that eliminates
or reduces its function would be observable as the
absence of a hypersensitive response upon infiltration of
the pathogen. The rps2 mutants displayed disease
symptoms or a null response when infiltrated with Psm
ES4326/avrRpt2, Pst DC3000/avrRpt2 or Psp
NPS3121/avrRpt2, respectively. Specifically, no HR
was elicited, indicating that the plants were
susceptible and had lost resistance to the pathogen
despite the presence of the avrRpt2 gene in the pathogen.

Pathogen growth in rps2 mutant plant leaves was
similar in the presence and absence of the avrRpt2 gene.
Growth of Psm ES4326 and Psm ES4326/avrRpt2 in rps2 mutants
was compared, and both strains were found to multiply
equally well in the rps2 mutants, at the same rate that
Psm ES4326 multiplied in wild-type Arabidopsis leaves.
Similar results were observed for Pst DC3000 and Pst
DC3000/avrRpt2 growth in rps2 mutants.

The rps2 mutants displayed an HR when infiltrated
with Pseudomonas pathogens carrying other avr genes: Psm
ES4326/avrB, Pst DC3000/avrB, Psm ES4326/avrRpm1, and Pst
DC3000/avrRpm1. The ability to mount an HR to an avr
gene other than avrRpt2 indicates that the defect in the
rps2 mutants isolated by selection with avrRpt2 is
specific to avrRpt2.

Mapping and Cloning of the RPS2 Gene

Genetic analysis of rps2 mutants rps2-101C, rps2-102C,
rps2-201C and rps2-101N showed that they all
corresponded to genes that segregated as expected for a
single Mendelian locus and that all four were most likely
allelic. The four rps2 mutants were mapped to the bottom
of chromosome IV using standard RFLP mapping procedures
including polymerase chain reaction (PCR)-based markers
(Yu et al., (1993), supra; Kunkel et al., (1993), supra;
and Mindrinos, M., unpublished). Segregation analysis
showed that rps2-101C and rps2-102C are tightly linked to
the PCR marker, PG11, while the RFLP marker M600 was used
to define the chromosome location of the rps2-201C
mutation (Fig. 1A) (Yu et al., (1993), supra; Kunkel et
al., (1993), supra). RPS2 has subsequently been mapped
to the centromeric side of PG11.
Heterozygous RPS2/rps2 plants display a defense
response that is intermediate between those displayed by
the wild-type and homozygous rps2/rps2 mutant plants (Yu,
et al., (1993), supra; and Kunkel et al., (1993), supra).
The heterozygous plants mounted an HR in response to Psm
ES4326/avrRpt2 or Pst DC3000/avrRpt2 infiltration;
however, the HR appeared later than in wild type plants
and required a higher minimum inoculum (Yu, et al.,
(1993), supra; and Kunkel et al., (1993), supra).

High Resolution Mapping of the RPS2 Gene and RPS2 cDNA
Isolation

To carry out map-based cloning of the RPS2 gene,
rps2-101N/rps2-101N was crossed with Landsberg erecta
RPS2/RPS2. Plants of the F1 generation were allowed to
self pollinate (to "self") and 165 F2 plants were selfed
to generate F3 families. Standard RFLP mapping
procedures showed that rps2-101N maps close to and on the
centromeric side of the RFLP marker, PG11. To obtain a
more detailed map position, rps2-101N/rps2-101N was
crossed with a doubly marked Landsberg erecta strain
containing the recessive mutations, cer2 and ap2. The
genetic distance between cer2 and ap2 is approximately 15
cM, and the rps2 locus is located within this interval.
F2 plants that displayed either a CER2 ap2 or a cer2 AP2
genotype were collected, selfed, and scored for RPS2 by
inoculating at least 20 F3 plants for each F2 with Psm
ES4326/avrRpt2. DNA was also prepared from a pool of
approximately 20 F3 plants for each F2 line. The CER2 ap2
and cer2 AP2 recombinants were used to carry out a
chromosome walk that is illustrated in Figure 1.
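
Scoring at least 20 F3 plants per F2 line can be motivated by a simple
probability estimate. The Python sketch below is an illustration under
stated assumptions and is not taken from the original disclosure: it
assumes simple Mendelian segregation, so that each F3 plant from a
selfed RPS2/rps2 heterozygote has a 1/4 chance of being rps2/rps2 and
scoring as susceptible, and it ignores the intermediate response of
heterozygotes noted earlier. The function name is hypothetical.

```python
def prob_no_susceptible(n_f3, p_susceptible=0.25):
    """Chance that none of n_f3 progeny of a heterozygous F2 is susceptible.

    If this happens, a segregating (heterozygous) F2 line would be
    mistaken for a homozygous resistant line.
    """
    if n_f3 < 0:
        raise ValueError("number of F3 plants cannot be negative")
    return (1.0 - p_susceptible) ** n_f3


# Misclassification risk when 20 F3 plants are scored per F2 line.
RISK_WITH_20 = prob_no_susceptible(20)
```

Under these assumptions the risk with 20 plants is well below 1%,
whereas scoring only 5 plants would leave a risk of about 24%.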
As shown in Figure 1, RPS2 was mapped to a 28-35
kb region spanned by cosmid clones E4-4 and E4-6. This
region contains at least six genes that produce
detectable transcripts. There were no significant
differences in the sizes of the transcripts or their
level of expression in the rps2 mutants as determined by
RNA blot analysis. cDNA clones of each of these
transcripts were isolated and five of these were
sequenced. As is described below, one of these
transcripts, cDNA-4, was shown to correspond to the RPS2
locus. From this study, three independent cDNA clones
(cDNA-4-4, cDNA-4-5, and cDNA-4-11) were obtained
corresponding to RPS2 from Columbia ecotype wild type
plants. The apparent sizes of RPS2 transcripts were 3.8
and 3.1 kb as determined by RNA blot analysis.

A fourth independent cDNA-4 clone (cDNA-4-2453)
was obtained using map-based isolation of RPS2 in a
separate study. Yeast artificial chromosome (YAC) clones
were identified that carry contiguous, overlapping
inserts of Arabidopsis thaliana ecotype Col-0 genomic DNA
from the M600 region spanning approximately 900 kb in the
RPS2 region. Arabidopsis YAC libraries were obtained
from J. Ecker and E. Ward, supra and from E. Grill (Grill
and Somerville (1991) Mol. Gen. Genet. 226:484-490).
Cosmids designated "H" and "E" were derived from the YAC
inserts and were used in the isolation of RPS2 (Fig. 1).

The genetic and physical location of RPS2 was more
precisely defined using physically mapped RFLP, RAPD
(random amplified polymorphic DNA) and CAPS (cleaved
amplified polymorphic sequence) markers. Segregating
populations from crosses between plants of genotype
RPS2/RPS2 (No-0 wild type) and rps2-201/rps2-201 (Col-0
background) were used for genetic mapping. The RPS2
locus was mapped using markers 17B7LE, PG11, M600 and
other markers. For high-resolution genetic mapping, a
set of tightly linked RFLP markers was generated using
insert end fragments from YAC and cosmid clones (Fig. 1)
(Kunkel et al. (1993), supra; Konieczny and Ausubel
(1993) Plant J. 4:403-410; and Chang et al. (1988) PNAS
USA 85:6856-6860). Cosmid clones E4-4 and E4-6 were then
used to identify expressed transcripts (designated cDNA-4,
-5, -6, -7, -8 of Fig. 1F) from this region, including
the cDNA-4-2453 clone.

RPS2 DNA Sequence Analysis

DNA sequence analysis of cDNA-4 from wild-type
Col-0 plants and from mutants rps2-101C, rps2-102C,
rps2-201C and rps2-101N showed that cDNA-4 corresponds to
RPS2. DNA sequence analysis of rps2-101C, rps2-102C and
rps2-201C revealed changes from the wild-type sequence as
shown in Table 1. The numbering system in Table 1 starts
at the ATG start codon encoding the first methionine
where A is nucleotide 1. DNA sequence analysis of cDNA-4
corresponding to mutant rps2-102C showed that it differed
from the wild type sequence at amino acid residue 476.
Moreover, DNA sequence analysis of the cDNA corresponding
to cDNA-4 from rps2-101N showed that it contained a 10 bp
insertion at amino acid residue 581, a site within the
leucine-rich repeat region, which causes a shift in the
RPS2 reading frame. Mutant rps2-101C contains a mutation
that leads to the formation of a chain termination codon.
The DNA sequence of mutant allele rps2-201C revealed a
mutation altering a single amino acid within a segment of
the LRR region that also has similarity to the
helix-loop-helix motif, further supporting the designation of

this locus as the RPS2 gene. The DNA and amino acid
sequences are shown in Figure 2.

Table 1

Mutant      Wild type             Position of   Change
                                  mutation
rps2-101C   703 TGA 705           704           TAA (stop codon)
rps2-101N   1741 GTG 1743         1741          GTGGAGTTGTATG (insertion)
rps2-102C   1426 AGA 1428 (arg)   1427          AAA (lys), amino acid 476
rps2-201C   2002 ACC 2004 (thr)   2002          CCC (pro), amino acid 668
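
The nucleotide positions in Table 1 can be related to the amino acid
residue numbers quoted in the text with a one-line conversion, given
that numbering starts at the A of the ATG start codon. The Python
sketch below is a worked check only; the function names are
hypothetical and the positions are those listed in Table 1.

```python
def codon_number(nt_position):
    """Amino acid residue number for a 1-based coding-sequence position.

    Position 1 is the A of the ATG start codon, so nucleotides 1-3
    encode residue 1, nucleotides 4-6 encode residue 2, and so on.
    """
    if nt_position < 1:
        raise ValueError("positions are 1-based")
    return (nt_position - 1) // 3 + 1


def shifts_reading_frame(inserted_bp):
    """An insertion shifts the frame unless its length is a multiple of 3."""
    return inserted_bp % 3 != 0


# Positions of mutation as listed in Table 1.
TABLE_1_POSITIONS = {
    "rps2-101C": 704,
    "rps2-101N": 1741,
    "rps2-102C": 1427,
    "rps2-201C": 2002,
}

RESIDUES = {allele: codon_number(pos) for allele, pos in TABLE_1_POSITIONS.items()}
```

The computed residues for rps2-102C, rps2-101N, and rps2-201C are 476,
581, and 668, in agreement with the text, and a 10 bp insertion is not
a multiple of three, consistent with the reported frame shift.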



DNA sequence analysis of cDNA-4 corresponding to
RPS2 from wild-type Col-0 plants revealed an open reading
frame (between two stop codons) spanning 2,751 bp. There
are 2,727 bp between the first methionine codon of this
reading frame and the 3' stop codon, which corresponds to
a deduced 909 amino acid polypeptide (see open reading
frame "a" of Fig. 2). The amino acid sequence has a
relative molecular weight of 104,460 and a pI of 6.51.
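
The relationship between the open reading frame, the first methionine
codon, and the length of the deduced polypeptide can be illustrated
with a toy sequence. The Python sketch below is illustrative only: the
function name and the 18-nucleotide toy sequence are hypothetical and
are not RPS2 sequence. It counts codons from the first in-frame ATG up
to, but not including, the next in-frame stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def coding_region(seq, frame=0):
    """Return (start_index, n_codons) for the first ATG in the given frame.

    n_codons counts the ATG and every following codon up to, but not
    including, the next in-frame stop codon. Returns None if there is
    no in-frame ATG or no in-frame stop after it.
    """
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    if "ATG" not in codons:
        return None
    first_met = codons.index("ATG")
    for n, codon in enumerate(codons[first_met:]):
        if codon in STOP_CODONS:
            return frame + 3 * first_met, n
    return None


# Toy frame: stop, one upstream codon, ATG, two codons, stop.
TOY = "TAA" + "GCC" + "ATG" + "GCT" + "AAA" + "TGA"
```

For the toy sequence the result is a start index of 6 and 3 codons.
The same bookkeeping applied to the figures in the text gives 2,727 / 3
= 909 residues, with 2,751 - 2,727 = 24 bp (8 codons) lying between
the upstream stop codon and the first methionine codon.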

As discussed below, RPS2 belongs to a new class of
disease resistance genes; the structure of the Rps2
polypeptide does not resemble the protein structure of
the product of the previously cloned and published
avirulence gene-specific plant disease resistance gene,
Pto, which has a putative protein kinase domain. From
the above analysis of the deduced amino acid sequence,
RPS2 contains several distinct protein domains conserved
in other proteins from both eukaryotes and prokaryotes.
These domains include, but are not limited to, Leucine
Rich Repeats (LRR) (Kobe and Deisenhofer, (1994) Nature
366:751-756); nucleotide binding site, e.g. the kinase-1a
motif (P-loop) (Saraste et al. (1990) Trends in
Biochemical Sciences (TIBS) 15:430-434); Helix-Loop-Helix
(Murre et al. (1989) Cell 56:777-783); and Leucine Zipper
(Rodrigues and Park (1993) Mol. Cell Biol. 13:6711-6722).
The amino acid sequence of Rps2 contains an LRR motif (LRR
motif from amino acid residue 505 to amino acid residue
867), which is present in many known proteins and which
is thought to be involved in protein-protein interactions
and may thus allow interaction with other proteins that
are involved in plant disease resistance. The N-terminal
portion of the Rps2 polypeptide LRR is, for example,
related to the LRR of yeast (Saccharomyces cerevisiae)
adenylate cyclase, CYR1. A region predicted to be a
transmembrane spanning domain (Klein et al. (1985)
Biochim. Biophys. Acta 815:468-476) is located from
amino acid residue 350 to amino acid residue 365,
N-terminal to the LRR. An ATP/GTP binding site motif
(P-loop) is predicted to be located between amino acid
residue 177 and amino acid residue 194, inclusive. The
motifs are discussed in more detail below.
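
The residue ranges given above can be collected into a simple lookup
to see which predicted domain a given mutation falls in. The Python
sketch below is illustrative only; the dictionary and function names
are hypothetical, the ranges are those stated in this section
(inclusive), and the three mutation positions are the residue numbers
quoted in the text for rps2-102C, rps2-101N, and rps2-201C.

```python
# Predicted Rps2 motifs as (first residue, last residue), inclusive.
DOMAINS = {
    "P-loop": (177, 194),
    "transmembrane": (350, 365),
    "LRR": (505, 867),
}

# Residue positions of mutations as stated in the text.
MUTATION_RESIDUES = {
    "rps2-102C": 476,
    "rps2-101N": 581,
    "rps2-201C": 668,
}


def domains_at(residue, domains=DOMAINS):
    """Names of all predicted domains whose range includes the residue."""
    return [name for name, (start, end) in domains.items() if start <= residue <= end]


HITS = {allele: domains_at(pos) for allele, pos in MUTATION_RESIDUES.items()}
```

With these ranges, residues 581 and 668 fall inside the LRR, while
residue 476 lies between the predicted transmembrane region and the
LRR and so is assigned to none of the three listed motifs.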

From the above analysis of the deduced amino acid
sequence, the Rps2 polypeptide may have a membrane
receptor structure which consists of an N-terminal
extracellular region and a C-terminal cytoplasmic region.
Alternatively, the topology of the Rps2 polypeptide may be
the opposite: an N-terminal cytoplasmic region and a
C-terminal extracellular region. LRR motifs are
extracellular in many cases and the Rps2 LRR contains
five potential N-glycosylation sites.

Identification of RPS2 by Functional Complementation
Complementation of rps2-201 homozygotes with
genomic DNA corresponding to Arabidopsis thaliana
functionally confirmed that the genomic region encoding
cDNA-4 carries RPS2 activity. Cosmids were constructed
that contained overlapping contiguous sequences of wild
type Arabidopsis thaliana DNA from the RPS2 region
contained in YACs EW11D4, EW9C3, and YUP11F1 of Fig. 1
and Fig. 4. The cosmid vectors were constructed from
pSLJ4541 (obtained from J. Jones, Sainsbury Institute,
Norwich, England) which contains sequences that allow the
inserted sequence to be integrated into the plant genome
via Agrobacterium-mediated transformation (designated
"binary cosmid"). "H" and "E" cosmids (Fig. 1) were used
to identify clones carrying DNA from the Arabidopsis
thaliana genomic RPS2 region.

More than forty binary cosmids containing inserted
RPS2 region DNA were used to transform rps2-201
homozygous mutants utilizing Agrobacterium-mediated
transformation (Chang et al. (1990) p. 28, Abstracts of
the Fourth International Conference on Arabidopsis
Research, Vienna, Austria). Transformants which remained
susceptible (determined by methods including the observed
absence of an HR following infection with P. syringae pv.
phaseolicola strain 3121 carrying avrRpt2 and Psp 3121
without avrRpt2) indicated that the inserted DNA did not
contain functional RPS2. These cosmids conferred the
"Sus." or susceptible phenotype indicated in Fig. 4.
Transformants which had acquired avrRpt2-specific disease
resistance (determined by methods including the display
of a strong hypersensitive response (HR) when inoculated
with Psp 3121 with avrRpt2, but not following inoculation
with Psp 3121 without avrRpt2) suggested that the
inserted DNA contained a functional RPS2 gene capable of
conferring the "Res." or resistant phenotype indicated in
Fig. 4. Transformants obtained using the pD4 binary
cosmid displayed a strong resistance phenotype as
described above. The presence of the insert DNA in the
transformants was confirmed by classical genetic analysis
(the tight genetic linkage of the disease resistance
phenotype and the kanamycin resistance phenotype
conferred by the cotransformed selectable marker) and
Southern analysis. These results indicated that RPS2 is
encoded by a segment of the 18 kb Arabidopsis thaliana
genomic region carried on cosmid pD4 (Fig. 4).
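
The scoring rule used for the transformants can be written out
explicitly. The Python sketch below is a restatement of the criteria
described above and is not part of the original disclosure; the
function name is hypothetical, and the "unclassified" outcome for an
HR that appears without avrRpt2 is an assumption added so that every
input has a defined result.

```python
def complementation_phenotype(hr_with_avrRpt2, hr_without_avrRpt2):
    """Score a transformant from its response to Psp 3121 with and without avrRpt2.

    "Res." : strong HR with avrRpt2 but none without it, i.e. the insert
             supplies avrRpt2-specific resistance (functional RPS2).
    "Sus." : no HR in either case, i.e. the insert lacks functional RPS2.
    Any HR in the absence of avrRpt2 is not avrRpt2-specific and is left
    unclassified here (an assumption of this sketch).
    """
    if hr_with_avrRpt2 and not hr_without_avrRpt2:
        return "Res."
    if not hr_with_avrRpt2 and not hr_without_avrRpt2:
        return "Sus."
    return "unclassified"
```

Including the strain without avrRpt2 as a control is what makes the
"Res." call specific to avrRpt2 rather than a general response to
Psp 3121.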

To further localize the RPS2 locus and confirm its
ability to confer a resistance phenotype on the rps2-201
homozygous mutants, a set of six binary cosmids
containing partially overlapping genomic DNA inserts was
tested. The overlapping inserts pD2, pD4, pD14, pD15,
pD27, and pD47 were chosen based on the location of the
transcription corresponding to the five cDNA clones in
the RPS2 region (Fig. 4). These transformation
experiments utilized a vacuum infiltration procedure
(Bechtold et al. (1993) C.R. Acad. Sci. Paris
316:1194-1199) for Agrobacterium-mediated transformation.
Agrobacterium-mediated transformations with cosmids pD2,
pD14, pD15, pD39, and pD46 were performed using a root
transformation/regeneration protocol (Valvekens et al.
(1988), PNAS 85:5536-5540). The results of pathogen
inoculation experiments assaying for RPS2 activity in
these transformants are indicated in Fig. 4.

These experiments were further confirmed using a
modification of the vacuum infiltration procedure. In
particular, the procedure of Bechtold et al. (supra) was
modified such that plants were grown in peat-based
potting soil covered with a screen, primary
inflorescences were removed, and plants with secondary
inflorescences (approximately 3 to 15 cm in length) were
inverted directly into infiltration medium, infiltrated,
and then grown to seed harvest without removal from soil
(detailed protocol available on the AAtDB computer
database (43)). The presence of introduced sequences in
the initial pD4 transformant was verified by DNA blot
analysis with pD4 vector and insert sequences
(separately) as probes. The presence of the expected
sequences in transformants obtained with the vacuum
infiltration protocol was also confirmed by DNA blot
analysis. Root transformation experiments (19) were
performed with an easily regenerable rps2-201/rps2-201 x
No-0 mapping population. Transformants were obtained for
pD4 with in planta transformation, for pD2, 14, 16, 39,
and 49 with root transformation, and for pD2, 4, 14, 15,
27, and 47 with vacuum infiltration as modified.

Additional transformation experiments utilized
binary cosmids carrying the complete coding region and
more than 1 kb of upstream genomic sequence for only
cDNA-4 or cDNA-6. Using the vacuum infiltration
transformation method, three independent transformants
were obtained that carried the wild-type cDNA-6 genomic
region in a rps2-201C homozygous background (pAD431 of
Fig. 4). None of these plants displayed
avrRpt2-dependent disease resistance. Homozygous rps2-201C
mutants were transformed with wild-type genomic cDNA-4
(p4104 and p4115, each carrying Col-0 genomic sequences
corresponding to all of the cDNA-4 open reading frame,
plus approximately 1.7 kb of 5' upstream sequence and
approximately 0.3 kb of 3' sequence downstream of the
stop codon). These p4104 and p4115 transformants
displayed a disease resistance phenotype similar to the
wild-type RPS2 homozygotes from which the rps2 mutants were
derived. Additional mutants (rps2-101N and rps2-101C
homozygotes) also displayed avrRpt2-dependent resistance
when transformed with the cDNA-4 genomic region.

RPS2 Sequences Allow Detection of Other Resistance Genes 
DNA blot analysis of Arabidopsis thaliana genomic
DNA using RPS2 cDNA as the probe showed that Arabidopsis
contains several DNA sequences that hybridize to RPS2 or
a portion thereof, suggesting that there are several
related genes in the Arabidopsis genome.

From the aforementioned description and the
nucleic acid sequence shown in Fig. 2, it is possible to
isolate other plant disease resistance genes having about
50% or greater sequence identity to the RPS2 gene.
Detection and isolation can be carried out with an
oligonucleotide probe containing the RPS2 gene or a
portion thereof greater than 9 nucleotides in length,
and preferably greater than about 18 nucleotides in
length. Probes to sequences encoding specific structural
features of the Rps2 polypeptide are preferred as they
provide a means of isolating disease resistance genes
having similar structural domains. Hybridization can be
done using standard techniques such as are described in
Ausubel et al., Current Protocols in Molecular Biology,
John Wiley & Sons, (1989).

For example, high stringency conditions for
detecting the RPS2 gene include hybridization at about
42°C, and about 50% formamide; a first wash at about
65°C, about 2X SSC, and 1% SDS; followed by a second wash
at about 65°C and about 0.1X SSC. Lower stringency
conditions for detecting RPS genes having about 50%
sequence identity to the RPS2 gene include, for
example, hybridization at about 42°C in the absence of
formamide; a first wash at about 42°C, about 6X SSC, and
about 1% SDS; and a second wash at about 50°C, about 6X
SSC, and about 1% SDS. An approximately 350 nucleotide
DNA probe encoding the middle portion of the LRR region
of Rps2 was used as a probe in the above example. Under
lower stringency conditions, a minimum of 5 DNA bands
were detected in BamHI digested Arabidopsis thaliana
genomic DNA as sequences having sufficient sequence
identity to hybridize to DNA encoding the middle portion
of the LRR motif of Rps2. Similar results were obtained
using a probe containing a 300 nucleotide portion of the
RPS2 gene encoding the extreme N-terminus of Rps2 outside
of the LRR motif.
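As a rough illustration of why the two wash regimes differ in stringency, the following sketch estimates the melting temperature of a perfectly matched DNA duplex with a commonly used empirical formula for long hybrids. The formula itself, the assumed sodium concentration of about 0.195 M for 1X SSC, and the 50% GC content of the probe are assumptions made for illustration only; none of them is taken from this specification, and the function name is hypothetical.

```python
import math

# Assumption: 1X SSC is 0.15 M NaCl plus 0.015 M trisodium citrate,
# i.e. roughly 0.195 M sodium ions.
NA_MOLAR_PER_1X_SSC = 0.195


def estimate_tm(length_nt, gc_percent, ssc_x, formamide_percent=0.0):
    """Approximate Tm (deg C) of a perfectly matched DNA:DNA hybrid.

    Uses the empirical relation
        Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    which is only a rule of thumb for long probes.
    """
    sodium = ssc_x * NA_MOLAR_PER_1X_SSC
    return (81.5
            + 16.6 * math.log10(sodium)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_nt)


# An approximately 350-nucleotide probe; the GC content is assumed.
tm_final_wash_high = estimate_tm(350, 50.0, ssc_x=0.1)  # 0.1X SSC wash
tm_final_wash_low = estimate_tm(350, 50.0, ssc_x=6.0)   # 6X SSC wash
print(round(tm_final_wash_high, 1), round(tm_final_wash_low, 1))
```

Under these assumptions the 65°C wash in 0.1X SSC sits far closer to the estimated melting temperature than the 50°C wash in 6X SSC, which is the sense in which the first regime tolerates fewer mismatches.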
Isolation of other disease resistance genes is
performed by PCR amplification techniques well known to
those skilled in the art of molecular biology using
oligonucleotide primers designed to amplify only
sequences flanked by the oligonucleotides in genes having
sequence identity to RPS2. The primers are optionally
designed to allow cloning of the amplified product into a
suitable vector.

The RPS Disease-Resistance Gene Family 

As discussed above, we have discovered that the
Arabidopsis RPS2 gene described herein is representative
of a new class of plant resistance genes. Analysis of
the derived amino acid sequence for RPS2 revealed several
regions of similarity with known polypeptide motifs (see,
e.g., Schneider et al., Genes Dev. 6:797 (1991)). Most
prominent among these is a region of multiple,
leucine-rich repeats (LRRs). The LRR motif has been implicated
in protein-protein interactions and ligand binding in a
diverse array of proteins (see, e.g., Kornfield et al.,
Annu. Rev. Biochem. 64:631 (1985); Alber, Curr. Opin.
Gen. Dev. 2:205 (1992); Lupas et al., Science 252:1162
(1991); Saraste et al., Trends Biochem. Sci. 15:430
(1990)). In one example, LRRs form the hormone binding
sites of mammalian gonadotropin hormone receptors (see,
e.g., Lupas et al., Science 252:1162 (1991)) and, in
another example, a domain of yeast adenylate cyclase that
interacts with the RAS2 protein (Kornfield et al., Annu.
Rev. Biochem. 64:631 (1985)). In RPS2, the LRR domain
spans amino acids 503-867 and contains fourteen repeat
units of length 22-26 amino acids. A portion of each
repeat resembles the LRR consensus sequence
(I/L/V)XXLXXLXX(I/L)XL. In Figure 7, the LRRs from RPS2
are shown, as well as an RPS2 consensus sequence. Within
the RPS2 LRR region, five (of six) sequences matching the
N-glycosylation consensus sequence [NX(S/T)] were
observed (Figure 8, marked with a dot). In particular,
N-glycosylation is predicted to occur at amino acids 158,
543, 666, 757, 778, and 787. Interestingly, the single
nucleotide difference between functional RPS2 and mutant
allele rps2-201 is within the LRR coding region, and this
mutation disrupts one of the potential glycosylation
sites.
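The two consensus sequences quoted above can be searched for mechanically. The following is a minimal sketch, assuming that X stands for any amino acid and that overlapping matches should be reported; the function name and the toy sequence are hypothetical, and the toy sequence is not the RPS2 sequence.

```python
import re

# Consensus patterns as quoted in the text; "." stands for X (any residue).
# A lookahead is used so that overlapping matches are also reported.
LRR_CONSENSUS = re.compile(r"(?=([ILV]..L..L..[IL].L))")
NGLYC_CONSENSUS = re.compile(r"(?=(N.[ST]))")


def find_motifs(protein, pattern):
    """Return (1-based start position, matched residues) for every match."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(protein)]


# Hypothetical toy sequence used only to exercise the patterns.
toy = "MALKELSNLRVLDLSNTSIEK"
print(find_motifs(toy, LRR_CONSENSUS))    # [(3, 'LKELSNLRVLDL')]
print(find_motifs(toy, NGLYC_CONSENSUS))  # [(16, 'NTS')]
```

A match to NX(S/T) only marks a potential glycosylation site; whether a given site is used cannot be decided from the sequence alone.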

Also observed in the deduced amino acid sequence
for RPS2 is a second potential protein-protein
interaction domain, a leucine zipper (see, e.g., von
Heijne, J. Mol. Biol. 225:487 (1992)), at amino acids
30-57. This region contains four contiguous heptad repeats
that match the leucine zipper consensus sequence
(I/R)XDLXXX. Leucine zippers facilitate the dimerization
of transcription factors by formation of coiled-coil
structures, but no sequences suggestive of an adjacent
DNA binding domain (such as a strongly basic region or a
potential zinc-finger) were detected in RPS2.
Coiled-coil regions also promote specific interactions between
proteins that are not transcription factors (see, e.g.,
Ward et al., Plant Mol. Biol. 14:561 (1990); Ecker,
Methods 1:186 (1990); Grill et al., Mol. Gen. Genet.
226:484 (1991)), and computer database similarity
searches with the region spanning amino acids 30-57 of
RPS2 revealed highest similarity to the coiled-coil
regions of numerous myosin and paramyosin proteins.

A third RPS2 motif was found at the sequence
GPGGVGKT at deduced amino acids 182-189. This portion of
RPS2 precisely matches the generalized consensus for the
phosphate-binding loop (P-loop) of numerous ATP- and
GTP-binding proteins (see, e.g., Saraste et al., supra).
The postulated RPS2 P-loop is similar to those found in
RAS proteins and ATP synthase β-subunits (Saraste et al.,
supra), but surprisingly is most similar to the published
P-loop sequences for the nifH and chvD genes.
The presence of this P-loop sequence
strongly suggests nucleotide triphosphate binding as one
aspect of RPS2 function. This domain is also referred to
as a kinase-1a motif (or a nucleotide binding site, or
NBS). Other conserved NBSs are present in the RPS2
sequence; these NBSs include a kinase-2 motif at amino
acids 258-262 and a kinase-3a motif at amino acids
330-335.
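The match between GPGGVGKT and a generalized P-loop consensus can be checked with a short pattern search. This is a sketch only: the pattern (A/G)XXXXGK(S/T) is one widely used formulation of the P-loop consensus and is an assumption here, since the specification does not spell the consensus out; the function name is hypothetical.

```python
import re

# Assumed generalized P-loop pattern: (A/G)XXXXGK(S/T), X = any residue.
P_LOOP = re.compile(r"[AG].{4}GK[ST]")


def find_p_loops(protein, offset=1):
    """Return (position, residues) for each P-loop match.

    `offset` is the residue number of the first character of `protein`,
    so that fragments can be reported in full-length coordinates.
    """
    return [(m.start() + offset, m.group()) for m in P_LOOP.finditer(protein)]


# The RPS2 fragment quoted in the text, starting at deduced amino acid 182.
print(find_p_loops("GPGGVGKT", offset=182))  # [(182, 'GPGGVGKT')]
```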

Finally, inspection of the RPS2 sequence reveals a
fourth RPS2 motif, a potential membrane-spanning domain
located at amino acids 340-360. Within this region, a
conserved GLPLAL motif is found at amino acids 347-352.
The presence of the membrane-spanning domain raises the
possibility that the RPS2 protein is membrane localized,
with the N-terminal leucine zipper and P-loop domains
residing together on the opposite side of the membrane
from the LRR region. An orientation in which the
C-terminal LRR domain is extracellular is suggested by the
fact that five of the six potential N-linked
glycosylation sites occur C-terminal to the proposed
membrane-spanning domain, as well as by the overall more
positive charge of the N-terminal amino acid residues
(see, e.g., Kornfield et al., supra; von Heijne, supra).
A number of proteins that contain LRRs are postulated or
known to be membrane-spanning receptors in which the LRRs
are displayed extracellularly as a ligand-binding domain
(see, e.g., Lopez et al., Proc. Natl. Acad. Sci. 84:5615
(1987); Braun et al., EMBO J. 10:1885 (1991); Schneider
et al., supra).

The plant kingdom contains hundreds of resistance
genes that are necessarily divergent since they control
different resistance specificities. However, plant
defense responses such as production of activated oxygen
species, PR-protein gene expression, and the
hypersensitive response are common to diverse
plant-pathogen interactions. This implies that there are
points of convergence in the defense signal transduction
pathways downstream of initial pathogen recognition, and
also suggests that similar functional motifs may exist
among diverse resistance gene products. Indeed, RPS2 is
dissimilar from previously described disease resistance
genes such as Hm1 or Pto (see, e.g., Johal et al., supra;
Martin et al., supra), and thus represents a new class of
genes having disease resistance capabilities.

Isolation of Other Members of the RPS Disease-Resistance
Gene Family Using Conserved Motif Probes and Primers
We have discovered that the RPS2 motifs
described above are conserved in other disease-resistance
genes, including, without limitation, the N protein, the
L6 protein, and the Prf protein. As shown in Fig. 5 (A
and B), we have determined that the L6 polypeptide of
flax, the N polypeptide of tobacco, and the Prf
polypeptide of tomato each share unique regions of
similarity (including, but not limited to, the
leucine-rich repeats, the membrane-spanning domain, the leucine
zipper, and the P-loop and other NBS domains).

On the basis of this discovery, the isolation of
virtually any member of the RPS gene family is made
possible using standard techniques. In particular, using
all or a portion of the amino acid sequence of a
conserved RPS motif (for example, the amino acid
sequences defining any RPS P-loop, NBS, leucine-rich
repeat, leucine zipper, or membrane-spanning region), one
may readily design RPS oligonucleotide probes, including
RPS degenerate oligonucleotide probes (i.e., a mixture of
all possible coding sequences for a given amino acid
sequence). These oligonucleotides may be based upon the
sequence of either strand of the DNA comprising the
motif. General methods for designing and preparing such
probes are provided, for example, in Ausubel et al.,
supra and Guide to Molecular Cloning Techniques, 1987, S.
L. Berger and A. R. Kimmel, eds., Academic Press, New
York. These oligonucleotides are useful for RPS gene
isolation, either through their use as probes capable of
hybridizing to RPS complementary sequences or as primers
for various polymerase chain reaction (PCR) cloning
strategies.
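The design of a degenerate oligonucleotide from an amino acid motif can be sketched as a back-translation using the standard genetic code and the IUPAC ambiguity symbols. The function names below are hypothetical, and the per-position encoding is a simplification: for amino acids with six codons (Leu, Ser, Arg) the resulting symbol mixture covers a few codons that encode other residues.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO_ACIDS):
    CODONS.setdefault(aa, []).append(codon)

# IUPAC ambiguity symbols, keyed by the set of bases each symbol represents.
IUPAC = {frozenset(bases): symbol for symbol, bases in {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT"}.items()}


def back_translate(peptide):
    """Return a degenerate coding-strand oligonucleotide for `peptide`."""
    oligo = []
    for aa in peptide:
        for pos in range(3):
            bases = frozenset(codon[pos] for codon in CODONS[aa])
            oligo.append(IUPAC[bases])
    return "".join(oligo)


def degeneracy(oligo):
    """Number of distinct sequences in the degenerate mixture."""
    sizes = {symbol: len(bases) for bases, symbol in IUPAC.items()}
    total = 1
    for symbol in oligo:
        total *= sizes[symbol]
    return total


probe = back_translate("GVGKT")  # residues from the RPS2 P-loop GPGGVGKT
print(probe, degeneracy(probe))  # GGNGTNGGNAARACN 512
```

Shorter motifs or motifs rich in Met and Trp give less degenerate mixtures, which is one reason a given conserved region may be preferred over another for probe design.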

Hybridization techniques and procedures are well
known to those skilled in the art and are described, for
example, in Ausubel et al., supra and Guide to Molecular
Cloning Techniques, 1987, S. L. Berger and A. R. Kimmel,
eds., Academic Press, New York. If desired, a
combination of different oligonucleotide probes may be
used for the screening of the recombinant DNA library.
The oligonucleotides are labelled with 32P using methods
known in the art, and the detectably-labelled
oligonucleotides are used to probe filter replicas from a
recombinant plant DNA library. Recombinant DNA libraries
may be prepared according to methods well known in the
art, for example, as described in Ausubel et al., supra.
Positive clones may, if desired, be rescreened with
additional oligonucleotide probes based upon other RPS
conserved regions. For example, an RPS clone identified
based on hybridization with a P-loop-derived probe may be
confirmed by re-screening with a leucine-rich
repeat-derived oligonucleotide.

As discussed above, RPS oligonucleotides may also
be used as primers in PCR cloning strategies. Such PCR
methods are well known in the art and described, for
example, in PCR Technology, H.A. Erlich, ed., Stockton
Press, London, 1989; PCR Protocols: A Guide to Methods
and Applications, M.A. Innis, D.H. Gelfand, J.J. Sninsky,
and T.J. White, eds., Academic Press, Inc., New York,
1990; and Ausubel et al., supra. If desired, members of
the RPS disease-resistance gene family may be isolated
using the PCR "RACE" technique, or Rapid Amplification of
cDNA Ends (see, e.g., Innis et al., supra). By this
method, oligonucleotide primers based on an RPS conserved
domain are oriented in the 3' and 5' directions and are
used to generate overlapping PCR fragments. These
overlapping 3'- and 5'-end RACE products are combined to
produce an intact full-length cDNA. This method is
described in Innis et al., supra; and Frohman et al.,
Proc. Natl. Acad. Sci. 85:8998, 1988.

Any number of probes and primers according to the
invention may be designed based on the conserved RPS
motifs described herein. Preferred motifs are boxed in
the sequences shown in Fig. 5 (A or B). In particular,
oligonucleotides according to the invention may be based
on the conserved P-loop domain, the amino acids of which
are shown below:

MOTIF 1
L6     G MGGIGKTTTA  [SEQ ID NO: 110]
N      G MGGVGKTTIA  [SEQ ID NO: 111]
PrfP   G MPGLGKTTLA  [SEQ ID NO: 112]
RPS2   G PGGVGKTTLM  [SEQ ID NO: 113]

From these sequences, appropriate oligonucleotides are
designed and prepared using standard methods. Particular
examples of RPS oligonucleotides based on the P-loop
domain are as follows (N is A, C, T, or G).

Based on MOTIF 1:

5' GGNATGGGNGGNNTNGGNAA(A or G)ACNAC 3' [SEQ ID NO: 158]

5' NCGNG(A/T)NGTNA(T/G)(G/A/T)A(T/A)NCGNA 3' [SEQ ID NO: 159]

5' GG(T or A)NT(T or G or C)GG(T or A)AA(G or A)AC(T or C or A)AC 3'
[SEQ ID NO: 160]

5' GGNATGGGNGGNNTNGGNAA(A or G)ACNAC 3' [SEQ ID NO: 158]

5' N(G or A)(C or T)N(A or G)(A or G or T)NGTNGT(C or T)TTNCCNANNCCN(G or L)(G or C)N(G or A)(T or G)NCC 3'
[SEQ ID NO: 161]

5' GGN(C or A)(T or C)N(G or C)(G or C)NGGNNTNGGNAA(A or G)ACNAC 3'
[SEQ ID NO: 162]
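The parenthesized notation used for these oligonucleotides can be converted to single-letter IUPAC ambiguity symbols, which also makes the size of each degenerate mixture easy to compute. The following is a minimal sketch with hypothetical function names; it assumes that only the sequence itself (without the 5' and 3' labels or the SEQ ID tag) is passed in, and that every parenthesized group lists only the bases A, C, G, and T.

```python
import re

# IUPAC ambiguity symbols, keyed by the set of bases each symbol represents.
IUPAC = {frozenset(bases): symbol for symbol, bases in {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT"}.items()}

# Either a parenthesized group such as "(A or G)" or "(A/T)", or one base.
TOKEN = re.compile(r"\(([^)]*)\)|([ACGTN])")


def parse_oligo(text):
    """Return one frozenset of allowed bases per oligonucleotide position."""
    positions = []
    for group, single in TOKEN.findall(text.upper()):
        if single:
            positions.append(frozenset("ACGT" if single == "N" else single))
        else:
            positions.append(frozenset(re.findall(r"[ACGT]", group)))
    return positions


def to_iupac(positions):
    """Collapse the per-position base sets into an IUPAC string."""
    return "".join(IUPAC[p] for p in positions)


def degeneracy(positions):
    """Number of distinct sequences represented by the mixture."""
    total = 1
    for p in positions:
        total *= len(p)
    return total


seq_158 = parse_oligo("GGNATGGGNGGNNTNGGNAA (A or G) ACNAC")
print(to_iupac(seq_158), degeneracy(seq_158))
# GGNATGGGNGGNNTNGGNAARACNAC 32768
```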

Other conserved RPS motifs useful for
oligonucleotide design are shown below. These motifs are
also depicted in the sequence of Fig. 5 (A or B).

MOTIF 2
L6     FKILW  LDDVD  [SEQ ID NO: 114]
N      KKVLIV LDDID  [SEQ ID NO: 115]
PrfP   KRFLIL IDDVW  [SEQ ID NO: 116]
RPS2   KRFLLL LDDVW  [SEQ ID NO: 117]

MOTIF 3
L6     SRFIIT SR  [SEQ ID NO: 118]
N      SRIIIT TR  [SEQ ID NO: 119]
PrfP   SRIILT TR  [SEQ ID NO: 120]
RPS2   CKVMFT TR  [SEQ ID NO: 121]

MOTIF 4
L6     GLPLTLK V  [SEQ ID NO: 122]
N      GLPLALK V  [SEQ ID NO: 123]
PrfP   GLPLSW  L  [SEQ ID NO: 124]
RPS2   GLPLALI T  [SEQ ID NO: 125]

MOTIF 5
L6     KISYDAL  [SEQ ID NO: 126]
N      KISYDGL  [SEQ ID NO: 127]
PrfP   GFSYKNL  [SEQ ID NO: 128]
RPS2   KFSYDNL  [SEQ ID NO: 129]
From the above motifs and the sequence motifs designated
in Figure 5A and B, appropriate oligonucleotides are
designed and prepared. Particular examples of such RPS
oligonucleotides are as follows (N is A, T, C, or G).

Based on MOTIF 2:

5' T(T or C)GA(T or C)GA(T or C)(A or G)T(T or G or C)(T or G)(A or G)(T or G or C)(G or A)A 3'
[SEQ ID NO: 163]

5' T(T or C)CCA(G or C or A)A(T or C)(G or A)TC(A or G)TCNA 3'
[SEQ ID NO: 164]

5' (C or G or A)(T or C)(C or A)NA(T or C)(G or A)TC(G or A)TCNA(G or A or T)NA(G or A or C)NANNA(G or A)NA 3'
[SEQ ID NO: 165]

5' (T or A)(T or A)N(A or C)(A or G)(A or G)(T or G or A)TN(T or C)TNNTN(G or T or C)TN(A or T or C)TNGA(T or C)GA 3'
[SEQ ID NO: 166]

Based on MOTIF 3:

5' NCGNG(A or T)NGTNA(T or G)(G or A or T)A(T or A)NCGNGA 3'
[SEQ ID NO: 167]

5' NCGNG(A or T)NGTNA(T or G)(G or A or T)A(T or A)NCGNGA 3'
[SEQ ID NO: 167]

5' NC(G or T)N(G or C)(A or T)NGTNA(A or G or T)(A or G or T)AT(A or G or T)AATNG 3'
[SEQ ID NO: 168]

Based on MOTIF 4:

5' NA(G or A)NGGNA(G or A)NCC 3' [SEQ ID NO: 169]

5' GG(T or A)(T or C)T(T or G or C)CC(T or A)(T or C)T(T or G or C)GC(T or C or A)(T or C)T 3'
[SEQ ID NO: 170]

5' A(A or G)(T or G or A)GC(G or C or A)A(G or A)(T or A)GG(G or C or A)A(G or A)(A or G or T or C)CC 3'
[SEQ ID NO: 171]

5' NA(G or A)NGGNA(G or A)NCC 3' [SEQ ID NO: 169]

5' N(A or G)NN(T or A)(T or C)NA(G or C or A)N(C or G)(A or T or C)NA(G or A)NGGNA(G or A)NCC 3'
[SEQ ID NO: 172]

5' GGN(T or C)TNCCN(T or C)TN(G or A or T)(C or G)N(T or G or C)T 3'
[SEQ ID NO: 173]

Based on MOTIF 5:

5' A(A or G)(A or G)TT(A or G)TC(A or G)TA(G or A or T)(G or C)(T or A)(G or A)A(T or A)(C or T)TT 3'
[SEQ ID NO: 174]

5' A(G or A)N(T or C)(T or C)NT(C or T)(A or G)TAN(G or C)(A or G)NANN(C or T)(C or T) 3'
[SEQ ID NO: 175]

5' (G or A)(G or A)N(A or T)T(A or C or T)(T or A)(G or C)NTA(T or C)(G or A)AN(A or G)(A or C or G)N(T or C)T 3'
[SEQ ID NO: 176]

Based on MOTIF 6:

5' GTNTT(T or C)(T or C)TN(T or A)(G or C)NTT(T or C)(A or C)G(A or G)GG 3'
[SEQ ID NO: 177]

Based on MOTIF 7:

5' CCNAT(A or C or T)TT(T or C)TA(T or C)(G or A)(T or A)(G or T or C)GTNGA(T or C)CC 3'
[SEQ ID NO: 178]

Based on MOTIF 8:

5' GTNGGNAT(A or C or T)GA(T or C)(G or A)(A or C)NCA 3'
[SEQ ID NO: 179]

Based on MOTIF 9:

5' (G or A)AA(G or A)CANGC(A or G or T)AT(G or A)TCNA(G or A)(G or A)AA 3'
[SEQ ID NO: 180]

5' TT(T or C)(T or C)TNGA(T or C)AT(A or C or T)GCNTG(T or C)TT 3'
[SEQ ID NO: 181]

Based on MOTIF 10:

5' CCCAT(G or A)TC(T or C)(T or C)(T or G)NA(T or G or A)N(T or A)(G or A)(G or A)TC(A or G)TGCAT 3'
[SEQ ID NO: 182]

5' ATGCA(T or C)GA(T or C)(T or C)(T or A)N(A or C or T)TN(A or C)(A or G)(A or G)GA(T or C)ATGGG 3'
[SEQ ID NO: 183]

Based on MOTIF 11:

5' NA(G or A)N(G or C)(A or T)(T or C)T(T or C)NA(A or G)(C or T)TT 3'
[SEQ ID NO: 184]

5' (A or T)(G or C)NAA(A or G)(T or C)TN(A or G)A(A or G)(A or T)(G or C)N(T or C)T 3'
[SEQ ID NO: 185]

Based on MOTIF 12:

5' (A or G or T)(A or T)(A or T)(C or T)TCNA(G or A)N(G or C)(A or T)N(T or C)(G or T)NA(G or A)NCC 3'
[SEQ ID NO: 186]

5' GGN(T or C)TN(A or C)(G or A)N(A or T)(G or L)N(T or C)TNGA 3'
[SEQ ID NO: 187]

Once a clone encoding a candidate RPS family gene
is identified, it is then determined whether such gene is
capable of conferring disease-resistance to a plant host
using the methods described herein or other methods well
known in the art of molecular plant pathology.
A Biolistic Transient Expression Assay For Identification 
of Plant Resistance Genes 

We have developed a functional transient
expression system capable of providing a rapid and
broadly applicable method for identifying and
characterizing virtually any gene for its ability to
confer disease-resistance to a plant cell. In brief, the
assay system involves delivering by biolistic
transformation a candidate plant disease-resistance gene
to a plant tissue sample (e.g., a piece of tissue from a
leaf) and then evaluating the expression of the gene
within the tissue by appraising the presence or absence
of a disease-resistance response (e.g., the
hypersensitive response). This assay provides a method
for identifying disease-resistance genes from a wide
variety of plant species, including ones that are not
amenable to genetic or transgenic studies.
The principle of the assay is depicted in the top
portion of Figure 9. In general, plant cells carrying a
mutation in the resistance gene of interest are utilized.
Prior to biolistic transformation, the plant tissue is
infiltrated with a phytopathogenic bacterium carrying the
corresponding avirulence gene. In addition, a gene to be
assayed for its resistance gene activity is co-introduced
by biolistics with a reporter gene. The expression of
the cobombarded reporter gene serves as an indicator for
viability of the transformed cells. Both genes are
expressed under the control of a strong and constitutive
promoter. If the gene to be assayed does not complement
the resistance gene function, the plant cells do not
undergo a hypersensitive response (HR) and, therefore,
survive (Fig. 9, top panel, right). In this case, cells
accumulate a large amount of the reporter gene product.
If, on the other hand, a resistance gene is introduced,
the plant cells recognize the signal from the
avirulence-gene-carrying bacterium and undergo the HR because the
expressed resistance gene product complements the
function (Fig. 9, top panel, left). In this case, the
plant cells do not have enough time to accumulate a large
amount of reporter gene product before their death.
Given the transformation efficiency estimated by a proper
control (such as the uninfected half of the leaf),
measuring the accumulation of reporter gene product can
thus indicate whether the gene to be assayed complements
the resistance gene function.
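The decision logic of the assay can be summarized in a few lines. The sketch below is illustrative only: it assumes that reporter accumulation is expressed as a count of stained cells on each half of the leaf and that a fixed ratio separates the two outcomes, whereas the specification scores staining histochemically and gives no numerical threshold. The function name, the counts, and the threshold of 0.5 are all hypothetical.

```python
def score_complementation(stained_infected, stained_uninfected, threshold=0.5):
    """Score one bombarded leaf as '+' (complementation) or '-'.

    stained_infected   -- reporter-positive cells on the infected half
    stained_uninfected -- reporter-positive cells on the uninfected control
                          half, which estimates transformation efficiency
    threshold          -- hypothetical cut-off for "decreased" reporter signal
    """
    if stained_uninfected <= 0:
        raise ValueError("no transformed cells on the control half; "
                         "the bombardment cannot be scored")
    ratio = stained_infected / stained_uninfected
    # A strongly reduced signal on the infected half indicates that the
    # transformed cells underwent the HR before accumulating reporter product.
    return "+" if ratio < threshold else "-"


print(score_complementation(12, 150))   # candidate gene complements: '+'
print(score_complementation(140, 150))  # no complementation: '-'
```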

In one working example, we now demonstrate the
effectiveness of the transient expression assay, using
the bacterial avirulence gene avrRpt2 and the
corresponding Arabidopsis thaliana resistance gene RPS2
(Fig. 9, bottom panel). In brief, rps2 mutant leaves,
preinfected with P. syringae carrying avrRpt2, were
co-bombarded with two plasmids, one of which contained the
RPS2 gene and the other the Escherichia coli uidA gene
encoding β-glucuronidase (GUS; Jefferson et al., 1986,
supra). Both the RPS2 and uidA genes are located
downstream of the strong constitutive 35S promoter from
cauliflower mosaic virus (Odell et al., infra). If the
35S-RPS2 construct complements the rps2 mutation, the
transformed cells rapidly undergo programmed cell death
in response to the P. syringae carrying avrRpt2, and
relatively little GUS activity accumulates. If the rps2
mutation is not complemented, cell death does not occur
and high levels of GUS activity accumulate. These
differences in GUS activity are detected histochemically.
Because the cDNA library used to identify RPS2 was
constructed in the expression vector pKEx4tr, the
RPS2 cDNA construct in pKEx4tr could be used directly in
the transient assay. As shown in Fig. 11, pKEx4tr is a
cDNA expression vector designed for the unidirectional
insertion of cDNA inserts. Inserted cDNA is expressed
under the control of the 35S cauliflower mosaic virus
promoter.

Our results are shown in Fig. 9, lower panel. In
this experiment, we infected one side of a leaf of an
rps2 mutant plant with P. syringae pv. phaseolicola 3121
carrying avrRpt2 (Psp 3121/avrRpt2). Psp 3121 is a weak
pathogen of A. thaliana and Psp 3121/avrRpt2 can elicit
an HR in a plant carrying the resistance gene RPS2 (e.g.,
a wild type plant). Leaves of 5-week-old Arabidopsis
plants were infiltrated with an appropriate bacterial
suspension at a dose of 2 x 10^8/ml by hand infiltration
as described (Dong et al., supra). After an incubation
period (typically 2-4 hours), the leaves were bombarded
using a Bio-Rad PDS-1000/He apparatus (1100 psi).
Gold particles were prepared
according to the instructions of the manufacturer. For
each bombardment, 1.4 μg of pKEx4tr-G, 0.1 μg of a
plasmid to be tested, and 0.5 mg of 1 μm gold particles
were used. After the bombardment, the leaves were
incubated in a humidity chamber at 22°C for 1 day and
then subjected to histochemical GUS staining using
5-bromo-4-chloro-3-indolyl glucuronide (X-Gluc) at 37°C
for 12 hr (Jefferson, 1987, supra). This staining method
with X-Gluc stains cells expressing GUS enzyme with a
blue color. The uninfected side of the leaf serves as a
control for transformation efficiency of the leaf because
in a single leaf, transformation efficiency (i.e.,
density of transformed cells) is similar on both sides of
the leaf. If transformed cells on the infected side are
rapidly killed, staining of the cells on the infected
side is weaker than staining on the uninfected side.
When the resistance gene RPS2 was co-introduced, the
transformed cells on the infected side of the leaf showed
much weaker staining than ones on the uninfected side
(Fig. 10). In contrast, when an unrelated gene was
co-introduced, the transformed cells on the infected side
showed similar staining intensity to ones on the
uninfected side (Fig. 10).

Thus, as summarized in Table 2, 35S-RPS2 (cDNA-4),
but not cDNA-5 or cDNA-6, complemented the HR
phenotype of rps2-101C. (See Figure 1)

Table 2

Gene Tested                          Response (Decreased GUS Activity)a

ΔGUS (35S-uidA containing
  internal uidA deletion)
cDNA-5 (35S-ABU)
cDNA-4 (35S-RPS2)                    +
cDNA-6 (35S-CK1)

a When decreased GUS activity was observed on the
infiltrated side of the leaf, the response was scored as
plus (Fig. 10).



Both RPS2 cDNA-4 clones 4 and 11, corresponding to the
two different RPS2 transcript sizes, complemented the
rps2 mutant phenotype, indicating that both transcripts
encode a functional product. Moreover, 35S-RPS2 also
complemented mutants rps2-102C, rps2-101N, and rps2-201C,
further confirming that the rps2-101C, rps2-102C,
rps2-201C and rps2-101N mutations are all allelic. In short,
the cloned RPS2 gene complemented the rps2 mutation in
this transient expression assay, and complementation by
RPS2 was observed in all four available rps2 mutant
strains.

Next we used the transient assay system to test
the specificity of the cloned RPS2 gene for an
avrRpt2-generated signal (i.e., the "gene-for-gene" specificity
of a P. syringae avirulence gene and a corresponding A.
thaliana resistance gene (avrRpm1 and RPM1,
respectively)). This experiment involved the use of an
rps2-101 rpm1 double mutant that cannot mount an HR when
challenged with P. syringae carrying avrRpt2 or the
unrelated avirulence gene avrRpm1 (Debener et al., Plant
Journal 1:289-302, 1991). As summarized in Table 3,
complementation of the rps2 mutant phenotype by 35S-RPS2
was only observed in the presence of a signal generated
by avrRpt2, indicating that RPS2 does not simply
sensitize the plant resistance response in a nonspecific
manner.
Table 3

                      Construct Cobombarded
avr Gene              with 35S-uidA            Responsea

None (vector only)    ΔGUSb
avrRpt2               ΔGUS
avrRpm1               ΔGUS
None (vector only)    35S-RPS2
avrRpt2               35S-RPS2                 +
avrRpm1               35S-RPS2

a When decreased GUS activity was observed on the
infiltrated side of the leaf, the response was scored as
plus. (Figure 10, panel B)

b ΔGUS is 35S-uidA containing an internal deletion
in the uidA gene.

Also as shown in Table 3, the RPS2 gene complemented the
mutant phenotype when leaves were infected with Psp
3121/avrRpt2 but not with Psp 3121/avrRpm1. Therefore,
the RPS2 gene complemented only the rps2 mutation; it did
not complement the rpm1 mutation.

We have also discovered that overexpression of an
RPS gene family member, e.g., RPS2 but not other genes,
in the transient assay leads to apparent cell death,
obviating the need to know the corresponding avirulence
gene for a putative resistance gene that has been cloned.

Using this assay, any plant disease-resistance
gene may be identified from a cDNA expression library.
In one particular example, a cDNA library is constructed
in an expression vector and then introduced as described
herein into a plant cultivar or its corresponding mutant
plant lacking the resistance gene of interest.
Preferably, the cDNA library is divided into small pools,
and each pool co-introduced with a reporter gene. If a
pool contains a resistance gene clone (i.e., the pool
"complements" the resistance gene function), the positive
pool is divided into smaller pools and the same procedure
is repeated until identification of a single positive
clone is ultimately achieved. This approach facilitates
the cloning of any resistance gene of interest without
genetic crosses or the creation of transgenics.
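The pooling strategy just described amounts to a repeated subdivision of the positive pool. The following sketch simulates it under stated assumptions: `pool_is_positive` is a hypothetical stand-in for one round of the transient assay on a pool of clones, the number of sub-pools per round is arbitrary, and only the first positive sub-pool is followed when more than one is found.

```python
def find_positive_clone(clones, pool_is_positive, n_pools=10):
    """Narrow a library down to a single positive clone.

    clones           -- iterable of clone identifiers
    pool_is_positive -- callable taking a list of clones and returning True
                        when the pool complements in the assay (hypothetical)
    n_pools          -- number of sub-pools made in each round (must be >= 2)

    Returns (clone or None, number of subdivision rounds performed).
    """
    pool = list(clones)
    rounds = 0
    while len(pool) > 1:
        size = -(-len(pool) // n_pools)  # ceiling division
        subpools = [pool[i:i + size] for i in range(0, len(pool), size)]
        positives = [p for p in subpools if pool_is_positive(p)]
        if not positives:
            return None, rounds
        pool = positives[0]  # follow the first positive sub-pool only
        rounds += 1
    if pool and pool_is_positive(pool):
        return pool[0], rounds
    return None, rounds


# Simulated library of 1000 clones in which clone 637 carries the gene.
library = range(1000)
print(find_positive_clone(library, lambda pool: 637 in pool))  # (637, 3)
```

With ten sub-pools per round, a library of 1000 clones is resolved in three rounds of 10 assays each, rather than by testing every clone individually.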

We now describe the cloning of another member of 
the RPS gene family, the Prf gene of tomato. 

The initial step for the cloning of the Prf gene
came from classical genetic analysis which showed that
Prf was tightly linked to the tomato Pto gene (Salmeron
et al., The Plant Cell 6:511-520, 1994). This prompted
construction of a cosmid contig of 200 kb in length which
encompassed the Pto locus. DNA probes from this contig
were used to screen a tomato cDNA library constructed
using tomato leaf tissue that had been infected with Pst
expressing the avrPto avirulence gene as source material.
Two classes of cDNAs were identified based on
cross-hybridization of clones to each other. While one class
corresponded to members of the Pto gene family, the other
class displayed no hybridization to Pto family members.
On the assumption (based on the aforementioned
genetic analysis) that Prf might reside extremely close
to the Pto gene, cDNAs from the second class were
analyzed further as candidate Prf clones. These clones
were hybridized to filters containing DNAs from six
independent prf mutant lines that had been isolated by
diepoxybutane or fast neutron treatment. In one of the
fast neutron mutants, the cDNA probe revealed a 1.1 kb
deletion in the genomic DNA, suggesting that the cDNA
clone might in fact represent Prf. Wild-type DNA
corresponding to the deletion was cloned from Prf/Prf
tomato. A 5 kb region was sequenced and found to
potentially encode a protein containing P-loop and
leucine-rich repeat motifs, supporting the hypothesis
that this DNA encoded Prf. The corresponding DNA was
cloned and sequenced from the fast neutron mutant plant.
Sequencing this DNA confirmed the mutation to be a simple
1.1 kb deletion excising DNA between the potential P-loop
and leucine-rich repeat coding regions. The gene is
expressed, based on RT-PCR analysis which has shown that
an mRNA is transcribed from this region. The identity of
the cloned DNA as the Prf gene is based on both the
existence of the deletion mutation and the predicted
protein sequence, which reveals patches of strong
similarity to other cloned disease resistance gene
products throughout the amino-terminal half (as described
herein). A partial sequence of the Prf gene is shown in
Figure 12.

15 RPS Expression In Transgenic Plant Cells and Plants 
The expression of the RPS2 genes in plants
susceptible to pathogens carrying avrRpt2 is achieved by
introducing into a plant a DNA sequence containing the
RPS2 gene for expression of the Rps2 polypeptide. A
number of vectors suitable for stable transfection of
plant cells or for the establishment of transgenic plants
are available to the public; such vectors are described
in, e.g., Pouwels et al., Cloning Vectors: A Laboratory
Manual, 1985, Supp. 1987; Weissbach and Weissbach,
Methods for Plant Molecular Biology, Academic Press,
1989; and Gelvin et al., Plant Molecular Biology Manual,
Kluwer Academic Publishers, 1990. Typically, plant
expression vectors include (1) one or more cloned plant
genes under the transcriptional control of 5' and 3'
regulatory sequences and (2) a dominant selectable
marker. Such plant expression vectors may also contain,
if desired, a promoter regulatory region (e.g., a
regulatory region controlling inducible or constitutive,
environmentally- or developmentally-regulated, or cell-
or tissue-specific expression), a transcription
initiation start site, a ribosome binding site, an RNA
processing signal, a transcription termination site,
and/or a polyadenylation signal.
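
The vector layout just described (cloned gene under 5' and 3' regulatory control, plus a dominant selectable marker) can be pictured as a small data model. The following is only an illustrative sketch: the class names, field names, and the `layout` helper are hypothetical and are not part of any vector described in the references above; the example values are drawn from elements named in this specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExpressionCassette:
    """One cloned gene with its 5' and 3' regulatory sequences (hypothetical model)."""
    promoter: str
    coding_sequence: str
    terminator: str
    intron: Optional[str] = None  # optional RNA processing signal

    def layout(self) -> str:
        # List the elements in 5' to 3' order.
        parts = [self.promoter]
        if self.intron:
            parts.append(self.intron)
        parts += [self.coding_sequence, self.terminator]
        return " - ".join(parts)


@dataclass
class PlantExpressionVector:
    """One or more cassettes plus a dominant selectable marker (hypothetical model)."""
    selectable_marker: str
    cassettes: List[ExpressionCassette] = field(default_factory=list)

    def layout(self) -> List[str]:
        return [c.layout() for c in self.cassettes] + [f"marker: {self.selectable_marker}"]


vector = PlantExpressionVector(
    selectable_marker="kanamycin resistance",
    cassettes=[ExpressionCassette("CaMV 35S promoter", "RPS2", "nopaline synthase terminator")],
)
for line in vector.layout():
    print(line)
```

Keeping the cassette separate from the marker mirrors the two-part description in the text, so a different promoter or terminator can be swapped in without touching the selection scheme.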

An example of a useful plant promoter which could
be used to express a plant resistance gene according to
the invention is a caulimovirus promoter, e.g., the
cauliflower mosaic virus (CaMV) 35S promoter. These
promoters confer high levels of expression in most plant
tissues, and the activity of these promoters is not
dependent on virally encoded proteins. CaMV is a
source for both the 35S and 19S promoters. In most
tissues of transgenic plants, the CaMV 35S promoter is a
strong promoter (see, e.g., Odell et al., Nature 313:810,
(1985)). The CaMV promoter is also highly active in
monocots (see, e.g., Dekeyser et al., Plant Cell 2:591,
(1990); Terada and Shimamoto, Mol. Gen. Genet. 220:389,
(1990)).

Other useful plant promoters include, without
limitation, the nopaline synthase promoter (An et al.,
Plant Physiol. 88:547, (1988)) and the octopine synthase
promoter (Fromm et al., Plant Cell 1:977, (1989)).

For certain applications, it may be desirable to
produce the RPS2 gene product or the avrRpt2 gene product
in an appropriate tissue, at an appropriate level, or at
an appropriate developmental time. Thus, there are a
variety of gene promoters, each with its own distinct
characteristics embodied in its regulatory sequences,
shown to be regulated in response to the environment,
hormones, and/or developmental cues. These include gene
promoters that are responsible for (1) heat-regulated
gene expression (see, e.g., Callis et al., Plant Physiol.
88:965, (1988)), (2) light-regulated gene expression
(e.g., the pea rbcS-3A described by Kuhlemeier et al.,
Plant Cell 1:471, (1989); the maize rbcS promoter
described by Schaffner and Sheen, Plant Cell 3:997,
(1991); or the chlorophyll a/b-binding protein gene found
in pea described by Simpson et al., EMBO J. 4:2723,
(1985)), (3) hormone-regulated gene expression (e.g., the
abscisic acid responsive sequences from the Em gene of
wheat described by Marcotte et al., Plant Cell 1:969,
(1989)), (4) wound-induced gene expression (e.g., of wun1
described by Siebertz et al., Plant Cell 1:961, (1989)),
or (5) organ-specific gene expression (e.g., of the
tuber-specific storage protein gene described by Roshal
et al., EMBO J. 6:1155, (1987); the 23-kDa zein gene from
maize described by Schernthaner et al., EMBO J. 7:1249,
(1988); or the French bean β-phaseolin gene described by
Bustos et al., Plant Cell 1:839, (1989)).

Plant expression vectors may also optionally
include RNA processing signals, e.g., introns, which have
been shown to be important for efficient RNA synthesis
and accumulation (Callis et al., Genes and Dev. 1:1183,
(1987)). The location of the RNA splice sequences can
influence the level of transgene expression in plants.
In view of this fact, an intron may be positioned
upstream or downstream of an Rps2 polypeptide-encoding
sequence in the transgene to modulate levels of gene
expression.

In addition to the aforementioned 5' regulatory
control sequences, the expression vectors may also
include regulatory control regions which are generally
present in the 3' regions of plant genes (Thornburg et
al., Proc. Natl. Acad. Sci. USA 84:744, (1987); An et
al., Plant Cell 1:115, (1989)). For example, the 3'
terminator region may be included in the expression
vector to increase stability of the mRNA. One such
terminator region may be derived from the PI-II
terminator region of potato. In addition, other commonly
used terminators are derived from the octopine or
nopaline synthase signals.

The plant expression vector also typically
contains a dominant selectable marker gene used to
identify the cells that have become transformed. Useful
selectable marker genes for plant systems include
antibiotic resistance genes, for example, those
encoding resistance to hygromycin, kanamycin, bleomycin,
G418, streptomycin or spectinomycin. Genes required for
photosynthesis may also be used as selectable markers in
photosynthetic-deficient strains. Finally, genes
encoding herbicide resistance may be used as selectable
markers; useful herbicide resistance genes include the
bar gene encoding the enzyme phosphinothricin
acetyltransferase, which confers resistance to the broad
spectrum herbicide Basta® (Hoechst AG, Frankfurt,
Germany).

Efficient use of selectable markers is facilitated
by a determination of the susceptibility of a plant cell
to a particular selectable agent and a determination of
the concentration of this agent which effectively kills
most, if not all, of the untransformed cells. Some useful
concentrations of antibiotics for tobacco transformation
include, e.g., 75-100 µg/ml (kanamycin), 20-50 µg/ml
(hygromycin), or 5-10 µg/ml (bleomycin). A useful
strategy for selection of transformants for herbicide
resistance is described, e.g., in Vasil I.K., Cell
Culture and Somatic Cell Genetics of Plants, Vol. I, II,
III, Laboratory Procedures and Their Applications, Academic
Press, New York, 1984.

It should be readily apparent to one skilled in
the field of plant molecular biology that the level of
gene expression is dependent not only on the combination
of promoters, RNA processing signals and terminator
elements, but also on how these elements are used to
increase the levels of gene expression.

The above exemplary techniques may be used for the
expression of any gene in the RPS family.

Plant Transformation

Upon construction of the plant expression vector,
several standard methods are known for introduction of
the recombinant genetic material into the host plant for
the generation of a transgenic plant. These methods
include (1) Agrobacterium-mediated transformation (A.
tumefaciens or A. rhizogenes) (see, e.g., Lichtenstein
and Fuller, In: Genetic Engineering, Vol. 6, P.W.J. Rigby, ed.,
London, Academic Press, 1987; and Lichtenstein, C.P., and
Draper, J., In: DNA Cloning, Vol. II, D.M. Glover, ed.,
Oxford, IRL Press, 1985), (2) the particle delivery
system (see, e.g., Gordon-Kamm et al., Plant Cell 2:603,
(1990); or BioRad Technical Bulletin 1687, supra), (3)
microinjection protocols (see, e.g., Green et al., Plant
Tissue and Cell Culture, Academic Press, New York, 1987),
(4) polyethylene glycol (PEG) procedures (see, e.g.,
Draper et al., Plant Cell Physiol. 23:451, (1982); or
e.g., Zhang and Wu, Theor. Appl. Genet. 76:835, (1988)),
(5) liposome-mediated DNA uptake (see, e.g., Freeman et
al., Plant Cell Physiol. 25:1353, (1984)), (6)
electroporation protocols (see, e.g., Gelvin et al., supra;
Dekeyser et al., supra; or Fromm et al., Nature 319:791,
(1986)), and (7) the vortexing method (see, e.g., Kindle,
K., Proc. Natl. Acad. Sci. USA 87:1228, (1990)).

The following is an example outlining an
Agrobacterium-mediated plant transformation. The general
process for manipulating genes to be transferred into the
genome of plant cells is carried out in two phases.
First, all the cloning and DNA modification steps are
done in E. coli, and the plasmid containing the gene
construct of interest is transferred by conjugation into
Agrobacterium. Second, the resulting Agrobacterium
strain is used to transform plant cells. Thus, for the
generalized plant expression vector, the plasmid contains
an origin of replication that allows it to replicate in
Agrobacterium and a high copy number origin of
replication functional in E. coli. This permits facile
production and testing of transgenes in E. coli prior to
transfer to Agrobacterium for subsequent introduction
into plants. Resistance genes can be carried on the
vector, one for selection in bacteria, e.g.,
streptomycin, and the other that will express in plants,
e.g., a gene encoding kanamycin resistance or an
herbicide resistance gene. Also present are restriction
endonuclease sites for the addition of one or more
transgenes operably linked to appropriate regulatory
sequences and directional T-DNA border sequences which,
when recognized by the transfer functions of
Agrobacterium, delimit the region that will be
transferred to the plant.

In another example, plant cells may be transformed
by shooting into the cell tungsten microprojectiles on
which cloned DNA is precipitated. In the Biolistic
Apparatus (Bio-Rad, Hercules, CA) used for the shooting,
a gunpowder charge (22 caliber Power Piston Tool Charge)
or an air-driven blast drives a plastic macroprojectile
through a gun barrel. An aliquot of a suspension of
tungsten particles on which DNA has been precipitated is
placed on the front of the plastic macroprojectile. The
latter is fired at an acrylic stopping plate that has a
hole through it that is too small for the macroprojectile
to go through. As a result, the plastic macroprojectile
smashes against the stopping plate and the tungsten
microprojectiles continue toward their target through the
hole in the plate. For the instant invention the target
can be any plant cell, tissue, seed, or embryo. The DNA
introduced into the cell on the microprojectiles becomes
integrated into either the nucleus or the chloroplast.

Transfer and expression of transgenes in plant
cells are now routine practice for those skilled in the
art. It has become a major tool to carry out gene
expression studies and to attempt to obtain improved
plant varieties of agricultural or commercial interest.

Transgenic Plant Regeneration

Plant cells transformed with a plant expression
vector can be regenerated, e.g., from single cells,
callus tissue or leaf discs according to standard plant
tissue culture techniques. It is well known in the art
that various cells, tissues and organs from almost any
plant can be successfully cultured to regenerate an
entire plant; such techniques are described, e.g., in
Vasil, supra; Green et al., supra; Weissbach and
Weissbach, supra; and Gelvin et al., supra.

In one possible example, a vector carrying a
selectable marker gene (e.g., kanamycin resistance) and a
cloned RPS2 gene under the control of its own promoter
and terminator or, if desired, under the control of
exogenous regulatory sequences such as the 35S CaMV
promoter and the nopaline synthase terminator is
transformed into Agrobacterium. Transformation of leaf
tissue with vector-containing Agrobacterium is carried
out as described by Horsch et al. (Science 227:1229,
(1985)). Putative transformants are selected after a few
weeks (e.g., 3 to 5 weeks) on plant tissue culture media
containing kanamycin (e.g., 100 µg/ml).
Kanamycin-resistant shoots are then placed on plant tissue
culture media without hormones for root initiation.
Kanamycin-resistant plants are then selected for greenhouse
growth. If desired, seeds from self-fertilized transgenic
plants can then be sown in a soil-less medium and grown in a
greenhouse. Kanamycin-resistant progeny are selected by
sowing surface-sterilized seeds on hormone-free
kanamycin-containing media. Analysis for the integration
of the transgene is accomplished by standard techniques
(see, e.g., Ausubel et al., supra; Gelvin et al., supra).

Transgenic plants expressing the selectable marker
are then screened for transmission of the transgene DNA
by standard immunoblot and DNA and RNA detection
techniques. Each positive transgenic plant and its
transgenic progeny are unique in comparison to other
transgenic plants established with the same transgene.
Integration of the transgene DNA into the plant genomic
DNA is in most cases random and the site of integration
can profoundly affect the levels, and the tissue and
developmental patterns, of transgene expression.
Consequently, a number of transgenic lines are usually
screened for each transgene to identify and select plants
with the most appropriate expression profiles.

Transgenic lines are evaluated for levels of
transgene expression. Expression at the RNA level is
determined initially to identify and quantitate
expression-positive plants. Standard techniques for RNA
analysis are employed and include PCR amplification
assays using oligonucleotide primers designed to amplify
only transgene RNA templates and solution hybridization
assays using transgene-specific probes (see, e.g.,
Ausubel et al., supra). The RNA-positive plants are then
analyzed for protein expression by Western immunoblot
analysis using Rps2 polypeptide-specific antibodies (see,
e.g., Ausubel et al., supra). In addition, in situ
hybridization and immunocytochemistry according to
standard protocols can be done using transgene-specific
nucleotide probes and antibodies, respectively, to
localize sites of expression within transgenic tissue.

Once the Rps2 polypeptide has been expressed in
any cell or in a transgenic plant (e.g., as described
above), it can be isolated using any standard technique,
e.g., affinity chromatography. In one example, an
anti-Rps2 antibody (e.g., produced as described in Ausubel
et al., supra, or by any standard technique) may be attached
to a column and used to isolate the polypeptide. Lysis
and fractionation of Rps2-producing cells prior to
affinity chromatography may be performed by standard
methods (see, e.g., Ausubel et al., supra). Once
isolated, the recombinant polypeptide can, if desired, be
further purified, e.g., by high performance liquid
chromatography (see, e.g., Fisher, Laboratory Techniques
In Biochemistry And Molecular Biology, Work and Burdon,
eds., Elsevier, 1980).

These general techniques of polypeptide expression
and purification can also be used to produce and isolate
useful Rps2 fragments or analogs.

Antibody Production 

Using a polypeptide described above (e.g., the
recombinant protein or a chemically synthesized RPS
peptide based on its deduced amino acid sequence),
polyclonal antibodies which bind specifically to an RPS
polypeptide may be produced by standard techniques (see,
e.g., Ausubel et al., supra) and isolated, e.g.,
following peptide antigen affinity chromatography.
Monoclonal antibodies can also be prepared using standard
hybridoma technology (see, e.g., Kohler et al., Nature
256:495, 1975; Kohler et al., Eur. J. Immunol. 6:292,
1976; Hammerling et al., in Monoclonal Antibodies and T
Cell Hybridomas, Elsevier, N.Y., 1981; and Ausubel et
al., supra).

Once produced, polyclonal or monoclonal antibodies
are tested for specific RPS polypeptide recognition by
Western blot or immunoprecipitation analysis (by methods
described in Ausubel et al., supra). Antibodies which
specifically recognize an RPS polypeptide are considered
to be useful in the invention; such antibodies may be
used, e.g., for screening recombinant expression
libraries as described in Ausubel et al., supra.
Exemplary peptides (derived from Rps2) for antibody
production include:

LKFSYDNLESDLL [SEQ ID NO: 188]

GVYGPGGVGKTTLMQS [SEQ ID NO: 189]

GGLPLALITLGGAM [SEQ ID NO: 190]
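
As a rough illustration of how such peptides relate to the parent sequence, the sketch below locates the SEQ ID NO: 189 peptide within a short stretch of the Rps2 polypeptide of SEQ ID NO: 2 (converted here to one-letter code) and checks it for a P-loop. The pattern used, [AG]-x(4)-G-K-[ST], is the widely used Walker A consensus and is an assumption of this sketch; the specification itself does not define the P-loop by a pattern. The function name `find_p_loop` is hypothetical.

```python
import re

# Stretch of the Rps2 polypeptide from SEQ ID NO: 2, written in one-letter code.
RPS2_FRAGMENT = "LSEEEERGIIGVYGPGGVGKTTLMQSINNELI"

# Exemplary peptides listed above.
PEPTIDES = {
    "SEQ ID NO: 188": "LKFSYDNLESDLL",
    "SEQ ID NO: 189": "GVYGPGGVGKTTLMQS",
    "SEQ ID NO: 190": "GGLPLALITLGGAM",
}

# Walker A (P-loop) consensus; assumed here, not taken from the specification.
P_LOOP = re.compile(r"[AG].{4}GK[ST]")


def find_p_loop(protein):
    """Return (start index, matched residues) of the first P-loop match, or None."""
    match = P_LOOP.search(protein)
    return (match.start(), match.group()) if match else None


for name, peptide in PEPTIDES.items():
    position = RPS2_FRAGMENT.find(peptide)  # -1 if the peptide is outside this stretch
    print(name, "offset in fragment:", position, "P-loop:", find_p_loop(peptide))
```

Only the SEQ ID NO: 189 peptide lies within the stretch used here; the other two peptides come from regions of Rps2 further toward the carboxy terminus, so `find` reports -1 for them.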

Use 

Introduction of RPS2 into a transformed plant cell
provides for resistance to bacterial pathogens carrying
the avrRpt2 avirulence gene. For example, transgenic
plants of the instant invention expressing RPS2 might be
used to alter, simply and inexpensively, the disease
resistance of plants normally susceptible to plant
pathogens carrying the avirulence gene, avrRpt2.

The invention also provides for broad-spectrum
pathogen resistance by mimicking the natural mechanism of
host resistance. First, the RPS2 transgene is expressed
in plant cells at a sufficiently high level to initiate
the plant defense response constitutively in the absence
of signals from the pathogen. The level of expression
associated with plant defense response initiation is
determined by measuring the levels of defense response
gene expression as described in Dong et al., supra.
Second, the RPS2 transgene is expressed by a controllable
promoter such as a tissue-specific promoter, cell-type
specific promoter or by a promoter that is induced by an
external signal or agent, thus limiting the temporal and
tissue expression of a defense response. Finally, the
RPS2 gene product is co-expressed with the avrRpt2 gene
product. The RPS2 gene is expressed by its natural
promoter, by a constitutively expressed promoter such as
the CaMV 35S promoter, by a tissue-specific or cell-type
specific promoter, or by a promoter that is activated by
an external signal or agent. Co-expression of RPS2 and
avrRpt2 will mimic the production of gene products
associated with the initiation of the plant defense
response and provide resistance to pathogens in the
absence of specific resistance gene-avirulence gene
corresponding pairs in the host plant and pathogen.

The invention also provides for expression in
plant cells of a nucleic acid having the sequence of Fig.
2 or the expression of a degenerate variant thereof
encoding the amino acid sequence of open reading frame
"a" of Fig. 2.
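
To make the notion of a degenerate variant concrete, the sketch below translates a short stretch of SEQ ID NO: 1 (bases 31-48, beginning at the ATG) and a hand-made variant that uses different codons for the same residues. The variant sequence and the helper name `translate` are hypothetical illustrations; the codon table is the standard genetic code, built in TCAG order.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in frame 1; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


original = "ATGGATTTCATCTCATCT"  # SEQ ID NO: 1, bases 31-48
variant = "ATGGACTTTATTAGCAGT"   # hypothetical degenerate variant

print(translate(original))
print(translate(variant))
print("same polypeptide:", translate(original) == translate(variant))
```

The two DNA strings differ at several positions yet encode the same six residues, which is the property that the term "degenerate variant" relies on.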

The invention further provides for the isolation
of nucleic acid sequences having about 50% or greater
sequence identity to RPS2 by using the RPS2 sequence of
Fig. 2 or a portion thereof greater than 9 nucleotides
in length, and preferably greater than about 18
nucleotides in length, as a probe. Appropriate reduced
hybridization stringency conditions are utilized to
isolate DNA sequences having about 50% or greater
sequence identity to the RPS2 sequence of Fig. 2.
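
The 50% figure can be illustrated with a simple position-by-position comparison. The sketch below assumes two sequences of equal length that are already aligned without gaps, which is a simplification; real comparisons would normally use an alignment program. The probe is an 18-nucleotide stretch of SEQ ID NO: 1, while the candidate sequence and the function names are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that match between two equal-length, ungapped sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_identity_threshold(probe, candidate, threshold=50.0):
    """True when the candidate shares at least `threshold` percent identity with the probe."""
    return percent_identity(probe, candidate) >= threshold


probe = "GGTGGGGTTGGGAAGACA"      # 18 nt from SEQ ID NO: 1 (P-loop coding region)
candidate = "GGAGGTGTAGGCAAAACT"  # hypothetical related sequence

print(round(percent_identity(probe, candidate), 1))
print(meets_identity_threshold(probe, candidate))
```

An 18-nucleotide probe was chosen to match the preferred minimum probe length stated above; the threshold parameter defaults to the 50% value from the text.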

Also provided by the invention are short conserved
regions characteristic of RPS disease resistance genes.
These conserved regions provide oligonucleotide sequences
useful for the production of hybridization probes and PCR
primers for the isolation of other plant
disease-resistance genes.

Both the RPS2 gene and related RPS family genes
provide disease resistance to plants, especially crop
plants, most especially important crop plants such as
tomato, pepper, maize, wheat, rice and legumes such as
soybean and bean, or any plant which is susceptible to
pathogens carrying an avirulence gene, e.g., the avrRpt2
avirulence gene. Such pathogens include, but are not
limited to, Pseudomonas syringae strains.

The invention also includes any biologically
active fragment or analog of an Rps2 polypeptide. By
"biologically active" is meant possessing any in vivo
activity which is characteristic of the Rps2 polypeptide
shown in Fig. 2. A useful Rps2 fragment or Rps2 analog
is one which exhibits a biological activity in any
biological assay for disease resistance gene product
activity, for example, those assays described by Dong et
al. (1991), supra; Yu et al. (1993), supra; Kunkel et al.
(1993), supra; and Whalen et al. (1991). In particular, a
biologically active Rps2 polypeptide fragment or analog
is capable of providing substantial resistance to plant
pathogens carrying the avrRpt2 avirulence gene. By
substantial resistance is meant at least partial
reduction in susceptibility to plant pathogens carrying
the avrRpt2 gene.

Preferred analogs include Rps2 polypeptides (or
biologically active fragments thereof) whose sequences
differ from the wild-type sequence only by conservative
amino acid substitutions, for example, substitution of
one amino acid for another with similar characteristics
(e.g., valine for glycine, arginine for lysine, etc.) or
by one or more non-conservative amino acid substitutions,
deletions, or insertions which do not abolish the
polypeptide's biological activity.

Analogs can differ from naturally occurring Rps2
polypeptide in amino acid sequence or can be modified in
ways that do not involve sequence, or both. Analogs of
the invention will generally exhibit at least 70%,
preferably 80%, more preferably 90%, and most preferably
95% or even 99%, homology with a segment of 20 amino acid
residues, preferably 40 amino acid residues, or more
preferably the entire sequence of a naturally occurring
Rps2 polypeptide sequence.
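
One way to picture the segment-based comparison is a sliding window over two aligned sequences. The sketch below makes several assumptions that are not stated in the text: homology is scored as plain residue identity, the two sequences are pre-aligned and of equal length, and the best-scoring window is reported. The analog shown (a valine to isoleucine and a lysine to arginine change) and the function name are hypothetical.

```python
def best_window_identity(wild_type, analog, window=20):
    """Highest percent identity over any `window`-residue segment of two aligned sequences."""
    if len(wild_type) != len(analog):
        raise ValueError("sequences must be pre-aligned to equal length")
    if len(wild_type) < window:
        raise ValueError("sequences are shorter than the window")
    matches = [a == b for a, b in zip(wild_type, analog)]
    best = max(
        sum(matches[start:start + window])
        for start in range(len(matches) - window + 1)
    )
    return 100.0 * best / window


# Stretch of Rps2 from SEQ ID NO: 2 in one-letter code.
wild_type = "LSEEEERGIIGVYGPGGVGKTTLMQSINNELI"
# Hypothetical analog: V to I at index 11 and K to R at index 19.
analog = wild_type[:11] + "I" + wild_type[12:19] + "R" + wild_type[20:]

score = best_window_identity(wild_type, analog, window=20)
print(score, "meets 70% level:", score >= 70.0)
```

Changing `window` to 40 or to the full sequence length gives the other two comparison spans mentioned in the text, provided the aligned sequences are at least that long.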

Alterations in primary sequence include genetic
variants, both natural and induced. Also included are
analogs that include residues other than naturally
occurring L-amino acids, e.g., D-amino acids or
non-naturally occurring or synthetic amino acids, e.g., β or
γ amino acids. Also included in the invention are Rps2
polypeptides modified by in vivo chemical derivatization
of polypeptides, including acetylation, methylation,
phosphorylation, carboxylation, or glycosylation.

In addition to substantially full-length
polypeptides, the invention also includes biologically
active fragments of the polypeptides. As used herein,
the term "fragment", as applied to a polypeptide, means a
portion that will ordinarily be at least 20 residues, more
typically at least 40 residues, and preferably at least 60
residues in length. Fragments of Rps2 polypeptide can be
generated by methods known to those skilled in the art. The
ability of a candidate fragment to exhibit a biological
activity of Rps2 can be assessed by those methods
described herein. Also included in the invention are
Rps2 polypeptides containing residues that are not
required for biological activity of the peptide, e.g.,
those added by alternative mRNA splicing or alternative
protein processing events.

Other embodiments are within the following claims.

SEQUENCE LISTING

(1) GENERAL INFORMATION: 

(i) APPLICANT: Ausubel, Frederick M. 

Staskawicz, Brian J.
Brent, Andrew F. 
Dahlbeck, Douglas 
Katagiri, Fumiaki 
Kunkel, Barbara N. 
Mindrinos, Michael N. 
Yu, Guo-Liang 

(ii) TITLE OF INVENTION: RPS2 GENE AND USES THEREOF 

(iii) NUMBER OF SEQUENCES: 201 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Fish & Richardson 

(B) STREET: 225 Franklin Street Suite 3100 

(C) CITY: Boston 

(D) STATE: MA
(E) COUNTRY: USA
(F) ZIP: 02110-2904 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30B

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/227,360 

(B) FILING DATE: 13-APR-1994 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Clark, Paul T. 

(B) REGISTRATION NUMBER: 30,162 

(C) REFERENCE/DOCKET NUMBER: 00786/230001

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (617) 542-5070 

(B) TELEFAX: (617) 542-8906 

(C) TELEX: 100254 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2903 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

AAGTAAAAGA AAGAGCGAGA AATCATCGAA ATGGATTTCA TCTCATCTCT TATCGTTGGC    60
TGTGCTCAGG TGTTGTGTGA ATCTATGAAT ATGGCGGAGA GAAGAGGACA TAAGACTGAT   120
CTTAGACAAG CCATCACTGA TCTTGAAACA GCCATCGGTG ACTTGAAGGC CATACGTGAT   180
GACCTGACTT TACGGATCCA ACAAGACGGT CTAGAGGGAC GAAGCTGCTC AAATCGTGCC   240
AGAGAGTGGC TTAGTGCGGT GCAAGTAACG GAGACTAAAA CAGCCCTACT TTTAGTGAGG   300
TTTAGGCGTC GGGAACAGAG GACGCGAATG AGGAGGAGAT ACCTCAGTTG TTTCGGTTGT   360
GCCGACTACA AACTGTGCAA GAAGGTTTCT GCCATATTGA AGAGCATTGG TGAGCTGAGA   420
GAACGCTCTG AAGCTATCAA AACAGATGGC GGGTCAATTC AAGTAACTTG TAGAGAGATA   480
CCCATCAAGT CCGTTGTCGG AAATACCACG ATGATGGAAC AGGTTTTGGA ATTTCTCAGT   540
GAAGAAGAAG AAAGAGGAAT CATTGGTGTT TATGGACCTG GTGGGGTTGG GAAGACAACG   600
TTAATGCAGA GCATTAACAA CGAGCTGATC ACAAAAGGAC ATCAGTATGA TGTACTGATT   660
TGGGTTCAAA TGTCCAGAGA ATTCGGCGAG TGTACAATTC AGCAAGCCGT TGGAGCACGG   720
TTGGGTTTAT CTTGGGACGA GAAGGAGACC GGCGAAAACA GAGCTTTGAA GATATACAGA   780
GCTTTGAGAC AGAAACGTTT CTTGTTGTTG CTAGATGATG TCTGGGAAGA GATAGACTTG   840
GAGAAAACTG GAGTTCCTCG ACCTGACAGG GAAAACAAAT GCAAGGTGAT GTTCACGACA   900
CGGTCTATAG CATTATGCAA CAATATGGGT GCGGAATACA AGTTGAGAGT GGAGTTTCTG   960
GAGAAGAAAC ACGCGTGGGA GCTGTTCTGT AGTAAGGTAT GGAGAAAAGA TCTTTTAGAG  1020
TCATCATCAA TTCGCCGGCT CGCGGAGATT ATAGTGAGTA AATGTGGAGG ATTGCCACTA  1080
GCGTTGATCA CTTTAGGAGG AGCCATGGCT CATAGAGAGA CAGAAGAAGA GTGGATCCAT  1140
GCTAGTGAAG TTCTGACTAG ATTTCCAGCA GAGATGAAGG GTATGAACTA TGTATTTGCC  1200
CTTTTGAAAT TCAGCTACGA CAACCTCGAG AGTGATCTGC TTCGGTCTTG TTTCTTGTAC  1260
TGCGCTTTAT TCCCAGAAGA ACATTCTATA GAGATCGAGC AGCTTGTTGA GTACTGGGTC  1320
GGCGAAGGGT TTCTCACCAG CTCCCATGGC GTTAACACCA TTTACAAGGG ATATTTTCTC  1380
ATTGGGGATC TGAAAGCGGC ATGTTTGTTG GAAACCGGAG ATGAGAAAAC AGAGGTGAAG  1440
ATGCATAATG TGGTCAGAAG CTTTGCATTG TGGATGGCAT CTGAACAGGG GACTTATAAG  1500
GAGCTGATCC TAGTTGAGCC TAGCATGGGA CATACTGAAG CTCCTAAAGC AGAAAACTGG  1560
CGACAAGCGT TGGTGATCTC ATTGTTAGAT AACAGAATCC AGACCTTGCC TGAAAAACTC  1620
           AACTGACAAC ACTGATGCTC CAACAGAACA CCTCTTTGAA GAAGATTCCA  1680
ACAGGGTTTT TCATGCATAT GCCTGTTCTC AGAGTCTTGG ACTTGTCGTT CACAAGTATC  1740
ACTGAGATTC CGTTGTCTAT CAAGTATTTG GTGGAGTTGT ATCATCTGTC TATGTCAGGA  1800
ACAAAGATAA GTGTATTGCC ACAGGAGCTT GGGAATCTTA GAAAACTGAA GCATCTGGAC  1860
CTACAAAGAA CTCAGTTTCT TCAGACGATC CCACGAGATG CCATATGTTG GCTGAGCAAG  1920
CTCGAGGTTC TGAACTTGTA CTACAGTTAC GCCGGTTGGG AACTGCAGAG CTTTGGAGAA  1980
GATGAAGCAG AAGAACTCGG ATTCGCTGAC TTGGAATACT TGGAAAACCT AACCACACTC  2040
GGTATCACTG TTCTCTCATT GGAGACCCTA AAAACTCTCT TCGAGTTCGG TGCTTTGCAT  2100
AAACATATAC AGCATCTCCA CGTTGAAGAG TGCAATGAAC TCCTCTACTT CAATCTCCCA  2160
TCACTCACTA ACCATGGCAG GAACCTGAGA AGACTTAGCA TTAAAAGTTG CCATGACTTG  2220
GAGTACCTGG TCACACCCGC AGATTTTGAA AATGATTGGC TTCCGAGTCT AGAGGTTCTG  2280
ACGTTACACA GCCTTCACAA CTTAACCAGA GTGTGGGGAA ATTCTGTAAG CCAAGATTGT  2340
CTGCGGAATA TCCGTTGCAT AAACATTTCA CACTGCAACA AGCTGAAGAA TGTCTCATGG  2400
GTTCAGAAAC TCCCAAAGCT AGAGGTGATT GAACTGTTCG ACTGGAGAGA GATAGAGGAA  2460
TTGATAAGCG AACACGAGAG TCCATCCGTC GAAGATCCAA CATTGTTCCC AAGCCTGAAG  2520
           CTAGGGATCT GCCAGAACTA AAGAGCATCC TCCCATCTCG ATTTTCATTC  2580
CAAAAAGTTG AAACATTAGT CATCACAAAT TGCCCCAGAG TTAAGAAACT GCCGTTTCAG  2640
GAGAGGAGGA CCGAGATGAA CTTGCCAACA GTTTATTGTG AGGAGAAATG GTGGAAAGCA  2700
CTGGAAAAAG ATCAACCAAA CGAAGAGCTT TGTTATTTAC CGCGCTTTGT TCCAAATTGA  2760
TATAAGAGCT AAGAGCACTC TGTACAAATA TGTCCATTCA TAAGTAGCAG GAAGCCAGGA  2820
AGGTTGTTCC AGTGAAGTCA TCAACTTTCC ACATAGCCAC AAAACTAGAG ATTATGTAAT  2880
CATAAAAACC AAACTATCCG CGA                                          2903



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 885 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Lys Lys Glu Arg Glu Ile Ile Glu Met Asp Phe Ile Ser Ser Leu Ile
1               5                   10                  15

Val Gly Cys Ala Gln Val Leu Cys Glu Ser Met Asn Met Ala Glu Arg
            20                  25                  30

Arg Gly His Lys Thr Asp Leu Arg Gln Ala Ile Thr Asp Leu Arg Ile
        35                  40                  45

Gln Gln Asp Gly Leu Glu Gly Arg Ser Cys Ser Asn Arg Ala Arg Glu
    50                  55                  60

Trp Leu Ser Ala Val Gln Val Thr Glu Thr Lys Thr Ala Leu Leu Leu
65                  70                  75                  80

Val Arg Phe Arg Arg Arg Glu Gln Arg Thr Arg Met Arg Arg Arg Tyr
                85                  90                  95

Leu Ser Cys Phe Gly Cys Ala Asp Tyr Lys Leu Cys Lys Lys Val Ser
            100                 105                 110

Ala Ile Leu Lys Ser Ile Gly Glu Leu Arg Glu Arg Ser Glu Ala Ile
        115                 120                 125

Lys Thr Asp Gly Gly Ser Ile Gln Val Thr Cys Arg Glu Ile Pro Ile
    130                 135                 140

Lys Ser Val Val Gly Asn Thr Thr Met Met Glu Gln Val Leu Glu Phe
145                 150                 155                 160

Leu Ser Glu Glu Glu Glu Arg Gly Ile Ile Gly Val Tyr Gly Pro Gly
                165                 170                 175

Gly Val Gly Lys Thr Thr Leu Met Gln Ser Ile Asn Asn Glu Leu Ile
            180                 185                 190

Thr Lys Gly His Gln Tyr Asp Val Leu Ile Trp Val Gln Met Ser Arg
        195                 200                 205

Glu Phe Gly Glu Cys Thr Ile Gln Gln Ala Val Gly Ala Arg Leu Gly
    210                 215                 220

Leu Ser Trp Asp Glu Lys Glu Thr Gly Glu Asn Arg Ala Leu Lys Ile
225                 230                 235                 240

Tyr Arg Ala Leu Arg Gln Lys Arg Phe Leu Leu Leu Leu Asp Asp Val
                245                 250                 255

Trp Glu Glu Ile Asp Leu Glu Lys Thr Gly Val Pro Arg Pro Asp Arg
            260                 265                 270

Glu Asn Lys Cys Lys Val Met Phe Thr Thr Arg Ser Ile Ala Leu Cys
        275                 280                 285

Asn Asn Met Gly Ala Glu Tyr Lys Leu Arg Val Glu Phe Leu Glu Lys
    290                 295                 300

Lys His Ala Trp Glu Leu Phe Cys Ser Lys Val Trp Arg Lys Asp Leu
305                 310                 315                 320

Leu Glu Ser Ser Ser Ile Arg Arg Leu Ala Glu Ile Ile Val Ser Lys
                325                 330                 335

Cys Gly Gly Leu Pro Leu Ala Leu Ile Thr Leu Gly Gly Ala Met Ala
            340                 345                 350

His Arg Glu Thr Glu Glu Glu Trp Ile His Ala Ser Glu Val Leu Thr
        355                 360                 365

Arg Phe Pro Ala Glu Met Lys Gly Met Asn Tyr Val Phe Ala Leu Leu
    370                 375                 380

Lys Phe Ser Tyr Asp Asn Leu Glu Ser Asp Leu Leu Arg Ser Cys Phe
385                 390                 395                 400

Leu Tyr Cys Ala Leu Phe Pro Glu Glu His Ser Ile Glu Ile Glu Gln
                405                 410                 415

Leu Val Glu Tyr Trp Val Gly Glu Gly Phe Leu Thr Ser Ser His Gly
            420                 425                 430

Val Asn Thr Ile Tyr Lys Gly Tyr Phe Leu Ile Gly Asp Leu Lys Ala
        435                 440                 445

Ala Cys Leu Leu Glu Thr Gly Asp Glu Lys Thr Gln Val Lys Met His
    450                 455                 460

Asn Val Val Arg Ser Phe Ala Leu Trp Met Ala Ser Glu Gln Gly Thr
465                 470                 475                 480

Tyr Lys Glu Leu Ile Leu Val Glu Pro Ser Met Gly His Thr Glu Ala
                485                 490                 495

Pro Lys Ala Glu Asn Trp Arg Gln Ala Leu Val Ile Ser Leu Leu Asp
            500                 505                 510

Asn Arg Ile Gln Thr Leu Pro Glu Lys Leu Ile Cys Pro Lys Leu Thr
        515                 520                 525

Thr Leu Met Leu Gln Gln Asn Ser Ser Leu Lys Lys Ile Pro Thr Gly
    530                 535                 540

Phe Phe Met His Met Pro Val Leu Arg Val Leu Asp Leu Ser Phe Thr
545                 550                 555                 560

Ser Ile Thr Glu Ile Pro Leu Ser Ile Lys Tyr Leu Val Glu Leu Tyr
                565                 570                 575

His Leu Ser Met Ser Gly Thr Lys Ile Ser Val Leu Pro Gln Glu Leu
            580                 585                 590

Gly Asn Leu Arg Lys Leu Lys His Leu Asp Leu Gln Arg Thr Gln Phe
        595                 600                 605

Leu Gln Thr Ile Pro Arg Asp Ala Ile Cys Trp Leu Ser Lys Leu Glu
    610                 615                 620

Val Leu Asn Leu Tyr Tyr Ser Tyr Ala Gly Trp Glu Leu Gln Ser Phe
625                 630                 635                 640

Gly Glu Asp Glu Ala Glu Glu Leu Gly Phe Ala Asp Leu Glu Tyr Leu
                645                 650                 655

Glu Asn Leu Thr Thr Leu Gly Ile Thr Val Leu Ser Leu Glu Thr Leu
            660                 665                 670

Lys Thr Leu Phe Glu Phe Gly Ala Leu His Lys His Ile Gln His Leu
        675                 680                 685

His Val Glu Glu Cys Asn Glu Leu Leu Tyr Phe Asn Leu Pro Ser Leu
    690                 695                 700

Thr Asn His Gly Arg Asn Leu Arg Arg Leu Ser Ile Lys Ser Cys His
705                 710                 715                 720

Asp Leu Glu Tyr Leu Val Thr Pro Ala Asp Phe Glu Asn Asp Trp Leu
                725                 730                 735

Pro Ser Leu Glu Val Leu Thr Leu His Ser Leu His Asn Leu Arg Cys
            740                 745                 750

Ile Asn Ile Ser His Cys Asn Lys Leu Lys Asn Val Ser Trp Val Gln
        755                 760                 765

Lys Leu Pro Lys Leu Glu Val Ile Glu Leu Phe Asp Cys Arg Glu Ile
    770                 775                 780

Glu Glu Leu Ile Ser Glu His Glu Ser Pro Ser Val Glu Asp Pro Thr
785                 790                 795                 800

Leu Phe Pro Ser Leu Lys Thr Leu Arg Thr Arg Asp Leu Pro Glu Leu
                805                 810                 815

Asn Ser Ile Leu Pro Ser Arg Phe Ser Phe Gln Lys Val Glu Thr Leu
            820                 825                 830

Val Ile Thr Asn Cys Pro Arg Val Lys Lys Leu Pro Phe Gln Glu Arg
        835                 840                 845

Arg Thr Gln Met Asn Leu Pro Thr Val Tyr Cys Glu Glu Lys Trp Trp
    850                 855                 860

Lys Ala Leu Glu Lys Asp Gln Pro Asn Glu Glu Leu Cys Tyr Leu Pro
865                 870                 875                 880

Arg Phe Val Pro Asn
                885

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Glu His Ser Val Gln Ile Cys Pro Phe Ile Ser Ser Arg Lys Pro Gly
1 5 10 15

Arg Leu Phe Gln
20

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Ser His Gln Leu Ser Thr
1 5 
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Arg Leu Cys Asn His Lys Asn Gln Thr Ile Arg
1 5 10

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Ser Lys Arg Lys Ser Glu Lys Ser Ser Lys Trp Ile Ser Ser His Leu
1 5 10 15

Leu Ser Leu Ala Val Leu Arg Cys Cys Val Asn Leu 
20 25 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Ile Trp Arg Arg Glu Glu Asp Ile Arg Leu Ile Leu Asp Lys Pro Ser
1 5 10 15

Leu Ile Leu Lys Gln Pro Ser Val Thr
20 25 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 6 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Arg Pro Tyr Val Met Thr 
1 5 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Leu Tyr Gly Ser Asn Lys Thr Val 
1 5 
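
Each entry in this listing pairs a declared LENGTH with a run of three-letter residue codes, so the two can be cross-checked mechanically. The following is a minimal sketch of such a check, not part of the original listing; the function names `to_one_letter` and `length_matches` are hypothetical, and only the standard three-letter amino acid codes are assumed. It is shown on the residues listed for SEQ ID NO: 9, whose declared length of 6 does not agree with the eight residues printed; which of the two is correct cannot be determined from the listing alone.

```python
# Hypothetical helpers for sanity-checking a sequence listing entry.
# Assumption: residues are written as standard three-letter codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residues: str) -> str:
    """Convert whitespace-separated three-letter codes to one-letter codes.

    Raises ValueError on any token that is not a standard code, which
    also catches transcription errors such as 'Gin' or 'lie'.
    """
    out = []
    for token in residues.split():
        if token not in THREE_TO_ONE:
            raise ValueError(f"unrecognized residue code: {token!r}")
        out.append(THREE_TO_ONE[token])
    return "".join(out)


def length_matches(declared: int, residues: str) -> bool:
    """Return True if the declared LENGTH equals the number of residues."""
    return len(to_one_letter(residues)) == declared


# Residues as printed for SEQ ID NO: 9 (declared LENGTH: 6 amino acids).
seq9 = "Leu Tyr Gly Ser Asn Lys Thr Val"
print(to_one_letter(seq9))
print(length_matches(6, seq9))
```

Rejecting unknown tokens rather than skipping them is a deliberate choice here: a silently dropped residue would make a length mismatch harder to trace.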

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Arg Asp Glu Ala Ala Gln Ile Val Pro Glu Ser Gly Leu Val Arg Cys
1 5 10 15

Lys 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Arg Arg Leu Lys Gin Pro Tyr Phe 
1 5 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Gly Leu Gly Val Gly Asn Arg Gly Arg Glu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Gly Gly Asp Thr Ser Val Val Ser Val Val Pro Thr Thr Asn Cys Ala 
1 5 10 15 

Arg Arg Phe Leu Pro Tyr 
20 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
Arg Ala Leu Val Ser 
1 5

(2) INFORMATION FOR SEQ ID NO:15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15: 

Glu Asn Ala Leu Lys Leu Ser Lys Gln Met Ala Gly Gln Phe Lys
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Leu Val Glu Arg Tyr Pro Ser Ser Pro Leu Ser Glu Ile Pro Arg
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Trp Asn Arg Phe Trp Asn Phe Ser Val Lys Lys Lys Lys Glu Glu Ser
1 5 10 15

Leu Val Phe Met Asp Leu Val Gly Leu Gly Arg Gln Arg
20 25 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 7 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: 

Cys Arg Ala Leu Thr Thr Ser 
1 5 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Ser Gln Lys Asp Ile Ser Met Met Tyr
1 5 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Phe Gly Phe Lys Cys Pro Glu Asn Ser Ala Ser Val Gln Phe Ser Lys
1 5 10 15

Pro Leu Glu His Gly Trp Val Tyr Leu Gly Thr Arg Arg Arg Pro Ala 
20 25 30 

Lys Thr Glu Leu 
35 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

Arg Tyr Thr Glu Leu 
1 5 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Asp Arg Asn Val Ser Cys Cys Cys 
1 5 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 

Met Met Ser Gly Lys Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 

Thr Trp Arg Lys Leu Glu Phe Leu Asp Leu Thr Gly Lys Thr Asn Ala 
1 5 10 15

Arg

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 6 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

Cys Ser Arg His Gly Leu 
1 5 

(2) INFORMATION FOR SEQ ID NO:26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

His Tyr Ala Thr Ile Trp Val Arg Asn Thr Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27: 

Glu Trp Ser Phe Trp Arg Arg Asn Thr Arg Gly Ser Cys Ser Val Val 
1 5 10 15

Arg Tyr Gly Glu Lys Ile Phe
20 

(2) INFORMATION FOR SEQ ID NO: 28: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 11 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

Ser His His Gln Phe Ala Gly Ser Arg Arg Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Val Asn Val Glu Asp Cys His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30: 

Glu Glu Pro Trp Leu Ile Glu Arg Gln Lys Lys Ser Gly Ser Met Leu
1 5 10 15

Val Lys Phe 

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



WO 95/28423 PCT/US95/04589 



- 80 - 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

Leu Asp Phe Gln Gln Arg
1 5 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

Thr Met Tyr Leu Pro Phe 
1 5 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

Asn Ser Ala Thr Thr Thr Ser Arg Val Ile Cys Phe Gly Leu Val Ser
1 5 10 15

Cys Thr Ala Leu Tyr Ser Gln Lys Asn Ile Leu
20 25 

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

Arg Ser Ser Ser Leu Leu Ser Thr Gly Ser Ala Lys Gly Phe Ser Pro 
1 5 10 15

Ala Pro Met Ala Leu Thr Pro Phe Thr Arg Asp Ile Phe Ser Leu Gly
20 25 30

Ile

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

Lys Arg His Val Cys Trp Lys Pro Glu Met Arg Lys His Arg 
1 5 10

(2) INFORMATION FOR SEQ ID NO:36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

Arg Cys Ile Met Trp Ser Glu Ala Leu His Cys Gly Trp His Leu Asn
1 5 10 15

Arg Gly Leu Ile Arg Ser
20 

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

Leu Ser Leu Ala Trp Asp Ile Leu Lys Leu Leu Lys Gln Lys Thr Gly
1 5 10 15

Asp Lys Arg Trp
20

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38: 

Ile Thr Glu Ser Arg Pro Cys Leu Lys Asn Ser Tyr Ala Arg Asn
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

Cys Ser Asn Arg Thr Ala Leu 
1 5 

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Arg Arg Phe Gln Gln Gly Phe Ser Cys Ile Cys Leu Phe Ser Glu Ser
1 5 10 15

Trp Thr Cys Arg Ser Gln Val Ser Leu Arg Phe Arg Cys Leu Ser Ser
20 25 30

Ile Trp Trp Ser Cys Ile Ile Cys Leu Cys Gln Glu Gln Arg
35 40 45 
(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41: 

Val Tyr Cys His Arg Ser Leu Gly Ile Leu Glu Asn
1 5 10

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: 

Ser Ile Trp Thr Tyr Lys Glu Leu Ser Phe Phe Arg Arg Ser His Glu
1 5 10 15

Met Pro Tyr Val Gly 
20 

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 5 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

Ala Ser Ser Arg Phe 
1 5 

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

Thr Cys Thr Thr Val Thr Pro Val Gly Asn Cys Arg Ala Leu Glu Lys
1 5 10 15

Met Lys Gln Lys Asn Ser Asp Ser Leu Thr Trp Asn Thr Trp Lys Thr
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45: 

Pro His Ser Val Ser Leu Phe Ser His Trp Arg Pro 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

Lys Leu Ser Ser Ser Ser Val Leu Cys Ile Asn Ile Tyr Ser Ile Ser
1 5 10 15

Thr Leu Lys Ser Ala Met Asn Ser Ser Thr Ser Ile Ser His His Ser
20 25 30 

Leu Thr Met Ala Gly Thr 
35 

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

Glu Asp Leu Ala Leu Lys Val Ala Met Thr Trp Ser Thr Trp Ser His 
1 5 10 15

Pro Gln Ile Leu Lys Met Ile Gly Phe Arg Val
20 25 

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

Arg Tyr Thr Ala Phe Thr Thr 
1 5 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

Pro Glu Cys Gly Glu Ile Leu
1 5 

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

Ala Lys Ile Val Cys Gly Ile Ser Val Ala
1 5 10

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

Thr Phe His Thr Ala Thr Ser 
1 5 

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

Phe Arg Asn Ser Gln Ser
1 5 



(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53: 

Leu Asn Cys Ser Thr Ala Glu Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 54: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

Ala Asn Thr Arg Val His Pro Ser Lys Ile Gln His Cys Ser Gln Ala
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55: 

Glu Leu Gly Ile Cys Gln Asn
1 5 

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

Thr Ala Ser Ser His Leu Asp Phe His Ser Lys Lys Leu Lys His 
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

Ser Ser Gln Ile Ala Pro Glu Leu Arg Asn Cys Arg Phe Arg Arg Gly
1 5 10 15

Gly Pro Arg 



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

Thr Cys Gln Gln Phe Ile Val Arg Arg Asn Gly Gly Lys His Trp Lys
1 5 10 15

Lys Ile Asn Gln Thr Lys Ser Phe Val Ile Tyr Arg Ala Leu Phe Gln
20 25 30

Ile Asp Ile Arg Ala Lys Ser Thr Leu Tyr Lys Tyr Val His Ser
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

Val Ala Gly Ser Gln Glu Gly Cys Ser Ser Glu Val Ile Asn Phe Pro
1 5 10 15

His Ser His Lys Thr Arg Asp Tyr Val Ile Ile Lys Thr Lys Leu Ser
20 25 30

Ala



(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 amino acids 

(B) TYPE: amino acid 




(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Val Lys Glu Arg Ala Arg Asn His Arg Asn Gly Phe His Leu Ile Ser
1 5 10 15

Tyr Arg Trp Leu Cys Ser Gly Val Val 
20 25 

(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

Ile Tyr Glu Tyr Gly Gly Glu Lys Arg Thr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Leu Glu Gly His Thr 
1 5 

(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

Pro Asp Phe Thr Asp Pro Thr Arg Arg Ser Arg Gly Thr Lys Leu Leu 
1 5 10 15

Lys Ser Cys Gln Arg Val Ala
20 

(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

Cys Gly Ala Ser Asn Gly Asp 
1 5 

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:65: 

Asn Ser Pro Thr Phe Ser Glu Val 
1 5 

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 



Ala Ser Gly Thr Glu Asp Ala Asn Glu Glu Glu Ile Pro Gln Leu Phe
1 5 10 15




Arg Leu Cys Arg Leu Gln Thr Val Gln Glu Gly Phe Cys His Ile Glu
20 25 30 

Glu His Trp 
35 

(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

Ala Glu Arg Thr Leu 
1 5 

(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

Ser Tyr Gln Asn Arg Trp Arg Val Asn Ser Ser Asn Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

Arg Asp Thr His Gln Val Arg Cys Arg Lys Tyr His Asp Asp Gly Thr
1 5 10 15

Gly Phe Gly Ile Ser Gln
20 
(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

Arg Arg Arg Lys Arg Asn His Trp Cys Leu Trp Thr Trp Trp Gly Trp
1 5 10 15

Glu Asp Asn Val Asn Ala Glu His 
20 

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 

Gln Arg Ala Asp His Lys Arg Thr Ser Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 55 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

Cys Thr Asp Leu Gly Ser Asn Val Gln Arg Ile Arg Arg Val Tyr Asn
1 5 10 15

Ser Ala Ser Arg Trp Ser Thr Val Gly Phe Ile Leu Gly Arg Glu Gly
20 25 30

Asp Arg Arg Lys Gln Ser Phe Glu Asp Ile Gln Ser Phe Glu Thr Glu
35 40 45

Thr Phe Leu Val Val Ala Arg
50 55 

(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

Cys Leu Gly Arg Asp Arg Leu Gly Glu Asn Trp Ser Ser Ser Thr 
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 

Arg Asp Arg Arg Arg Val Asp Pro Cys 
1 5 

(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

Gln Gly Lys Gln Met Gln Gly Asp Val His Asp Thr Val Tyr Ser Ile
1 5 10 15

Met Gln Gln Tyr Gly Cys Gly Ile Gln Val Glu Ser Gly Val Ser Gly
20 25 30 

Glu Glu Thr Arg Val Gly Ala Val Leu 
35 40 
(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 

Gly Met Glu Lys Arg Ser Phe Arg Val Ile Ile Asn Ser Pro Ala Arg
1 5 10 15

Gly Asp Tyr Ser Glu 
20 

(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 

Met Trp Arg Ile Ala Thr Ser Val Asp His Phe Arg Arg Ser His Gly
1 5 10 15

Ser 



(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 

Ile Ser Ser Arg Asp Glu Gly Tyr Glu Leu Cys Ile Cys Pro Phe Glu
1 5 10 15

Ile Gln Leu Arg Gln Pro Arg Glu
20 




(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 

Ser Ala Ser Val Leu Phe Leu Val Leu Arg Phe Ile Pro Arg Arg Thr
1 5 10 15

Phe Tyr Arg Asp Arg Ala Ala Cys 
20 

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

Val Leu Gly Arg Arg Arg Val Ser His Gln Leu Pro Trp Arg
1 5 10

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 

His His Leu Gln Gly Ile Phe Ser His Trp Gly Ser Glu Ser Gly Met
1 5 10 15

Phe Val Gly Asn Arg Arg 
20 

(2) INFORMATION FOR SEQ ID NO: 82: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 7 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:82: 

Glu Asn Thr Gly Glu Asp Ala
1 5 

(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

Lys Thr His Met Pro Glu Thr Asp Asn Thr Asp Ala Pro Thr Glu Gly
1 5 10 15

Leu Phe Glu Glu Asp Ser Asn Arg Val Phe His Ala Tyr Ala Cys Ser 
20 25 30 

Gln Ser Leu Gly Leu Val Val His Lys Tyr His
35 40 

(2) INFORMATION FOR SEQ ID NO:84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 

Cys Gly Gln Lys Leu Cys Ile Val Asp Gly Ile
1 5 10

(2) INFORMATION FOR SEQ ID NO:85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 

Gly Ala Asp Pro Ser 
1 5 

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:86: 

Ser Arg Lys Leu Ala Thr Ser Val Gly Asp Leu Ile Val Arg
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:87: 

Gln Asn Pro Asp Leu Ala
1 5 

(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:88: 
Asp Ser Val Val Tyr Gln Val Phe Gly Gly Val Val Ser Ser Val Tyr
1 5 10 15

Val Arg Asn Lys Asp Lys Cys Ile Ala Thr Gly Ala Trp Glu Ser
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:89: 

Lys Thr Glu Ala Ser Gly Pro Thr Lys Asn Ser Val Ser Ser Asp Asp
1 5 10 15

Pro Thr Arg Cys His Met Leu Ala Glu Gln Ala Arg Gly Ser Glu Leu
20 25 30

Val Leu Gln Leu Arg Arg Leu Gly Thr Ala Glu Leu Trp Arg Arg
35 40 45

(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

Ser Arg Arg Thr Arg Ile Arg
1 5 

(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

Leu Gly Ile Leu Gly Lys Pro Asn His Thr Arg Tyr His Cys Ser Leu
1 5 10 15

Ile Gly Asp Pro Lys Asn Ser Leu Arg Val Arg Cys Phe Ala
20 25 30

(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:92: 

Thr Tyr Thr Ala Ser Pro Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:93: 

Thr Pro Leu Leu Gln Ser Pro Ile Thr His
1 5 10

(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 

Pro Trp Gln Glu Pro Glu Lys Thr
1 5 

(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:95:

Leu Gly Val Pro Gly His Thr Arg Arg Phe
1 5 10

(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 58 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 

Leu Ala Ser Glu Ser Arg Gly Ser Asp Val Thr Gln Pro Ser Gln Leu
1 5 10 15

Asn Gln Ser Val Gly Lys Phe Cys Lys Pro Arg Leu Ser Ala Glu Tyr
20 25 30

Pro Leu His Lys His Phe Thr Leu Gln Gln Ala Glu Glu Cys Leu Met
35 40 45

Gly Ser Glu Thr Pro Lys Ala Arg Gly Asp 
50 55 

(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:97:

Thr Val Arg Leu Gln Arg Asp Arg Gly Ile Asp Lys Arg Thr Arg Glu
1 5 10 15

Ser Ile Arg Arg Arg Ser Asn Ile Val Pro Lys Pro Glu Asp Leu Glu
20 25 30

Asn

(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 

Gly Ser Ala Arg Thr Lys Gln His Pro Pro Ile Ser Ile Phe Ile Pro
1 5 10 15

Lys Ser 

(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:99: 

Asn Ile Ser His His Lys Leu Pro Gln Ser
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 

Glu Thr Ala Val Ser Gly Glu Glu Asp Pro Asp Glu Leu Ala Asn Ser 
1 5 10 15

Leu Leu

(2) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

Thr Ser His His
1 

(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:

Glu Leu Arg Ala Leu Cys Thr Asn Met Ser Ile His Lys
1 5 10

(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:

Gln Glu Ala Arg Lys Val Val Pro Val Lys Ser Ser Thr Phe His Ile
1 5 10 15

Ala Thr Lys Leu Glu Ile Met
20 

(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 

Lys Pro Asn Tyr Pro Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1491 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:105:

ATCGATTGAT CTCTGGCTCA GTGCGAGTAG TCCATTTGAG AGCAGTCGTA GCCCCGCGTG 60

GCGCATCATG GAGCTATTTG GAATTTTCGC AGGGTTATCG ATTCGTAGTG GGAACCCATT 120

CATTGTTTGG AACCACCAAC GGACGACTTA ACAAGCTCCC CGAGGTGCAT GATGAAAATT 180

GCTCCAGTTG CCATAAATCA CAGCCCGCTC AGCAGGGAGG TCCCGTCACA CGCGGCACCC 240

ACTCAGGCAA AGCAAACCAA CCTTCAATCT GAAGCTGGCG ATTTAGATGC AAGAAAAAGT 300

AGCGCTTCAA GCCCGGAAAC CCGCGCATTA CTCGCTACTA AGACAGTACT CGGGAGACAC 360

AAGATAGAGG TTCCGGCCTT TGGAGGGTGG TTCAAAAAGA AATCATCTAA GCACGAGACG 420

GGCGGTTCAA GTGCCAACGC AGATAGTTCG AGCGTGGCTT CCGATTCCAC CGAAAAACCT 480

TTGTTCCGTC TCACGCACGT TCCTTACGTA TCCCAAGGTA ATGAGCGAAT GGGATGTTGG 540

TATGCCTGCG CAAGAATGGT TGGCCATTCT GTCGAAGCTG GGCCTCGCCT AGGGCTGCCG 600

GAGCTCTATG AGGGAAGGGA GGCGCCAGCT GGGCTACAAG ATTTTTCAGA TGTAGAAAGG 660

TTTATTCACA ATGAAGGATT AACTCGGGTA GACCTTCCAG ACAATGAGAG ATTTACACAC 720

GAAGAGTTGG GTGCACTGTT GTATAAGCAC GGGCCGATTA TATTTGGGTG GAAAACTCCG 780

AATGACAGCT GGCACATGTC GGTCCTCACT GGTGTCGATA AAGAGACGTC GTCCATTACT 840

TTTCACGATC CCCGACAGGG GCCGGACCTA GCAATGCCGC TCGATTACTT TAATCAGCGA 900

TTGGCATGGC AGGTTCCACA CGCAATGCTC TACCGCTAAG TAGCAGGGTA TCTTCACGTG 960

GCGGCATCAT GACAAGCCCA TGATGCCGCC AGCAGCTACC TGAATGCCGT CTGGCTTTTT 1020

GGTCCCTATT GTCGTATCCG GAAGATGACG TCAAAGAATC TCGGCAAGAG CTTTCTTGCT 1080

CGACTCCTCA GCTTCCGGAT CGATCAGGTC GCTTGCCAGA GCGCGCTTGT CCATGAGCAT 1140

CTGCCACAGC TGCTGGTCGA TGGTGTCCTC AGCTAAAGGG ATTTTGACGA CAACCATGCG 1200

CAACTGCCCG TTGCGATACG CTCGATCCTG AAGCCCCGGT GTCCATGGCA GCCCCAAGAA 1260 

AAAGACATAG TTCGCCGCTG TGAGGTTGTA GCCTGTGCCG GCGGCCGACC TGGTCCCGAT 1320 

AAACACCCTG CAGTCCGGAT CCTGCTGGAA AGCATCAATC GCCTTCTGCC GCTTCTTGGG 1380 

CGAGTCACTG CCCACCAACG TCACGCACCC GACGCCAAGC TTGAGGCAGT GCTCCCGCAA 1440 

CGTGGCCACG GATTCCTGAT ACTCGCAGAA GAGGATCACC TTGTCGTCGA C 1491

(2) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 255 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:

Met Lys Ile Ala Pro Val Ala Ile Asn His Ser Pro Leu Ser Arg Glu
1 5 10 15

Val Pro Ser His Ala Ala Pro Thr Gln Ala Lys Gln Thr Asn Leu Gln
20 25 30

Ser Glu Ala Gly Asp Leu Asp Ala Arg Lys Ser Ser Ala Ser Ser Pro
35 40 45

Glu Thr Arg Ala Leu Leu Ala Thr Lys Thr Val Leu Gly Arg His Lys
50 55 60

Ile Glu Val Pro Ala Phe Gly Gly Trp Phe Lys Lys Lys Ser Ser Lys
65 70 75 80

His Glu Thr Gly Gly Ser Ser Ala Asn Ala Asp Ser Ser Ser Val Ala
85 90 95

Ser Asp Ser Thr Glu Lys Pro Leu Phe Arg Leu Thr His Val Pro Tyr
100 105 110

Val Ser Gln Gly Asn Glu Arg Met Gly Cys Trp Tyr Ala Cys Ala Arg
115 120 125

Met Val Gly His Ser Val Glu Ala Gly Pro Arg Leu Gly Leu Pro Glu
130 135 140

Leu Tyr Glu Gly Arg Glu Ala Pro Ala Gly Leu Gln Asp Phe Ser Asp
145 150 155 160

Val Glu Arg Phe Ile His Asn Glu Gly Leu Thr Arg Val Asp Leu Pro
165 170 175

Asp Asn Glu Arg Phe Thr His Glu Glu Leu Gly Ala Leu Leu Tyr Lys
180 185 190

His Gly Pro Ile Ile Phe Gly Trp Lys Thr Pro Asn Asp Ser Trp His
195 200 205

Met Ser Val Leu Thr Gly Val Asp Lys Glu Thr Ser Ser Ile Thr Phe
210 215 220

His Asp Pro Arg Gln Gly Pro Asp Leu Ala Met Pro Leu Asp Tyr Phe
225 230 235 240

Asn Gln Arg Leu Ala Trp Gln Val Pro His Ala Met Leu Tyr Arg
245 250 255

(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1209 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

Met Asn Pro Ser Gly Ser Phe Pro Ser Val Glu Tyr Glu Val Phe Leu
1 5 10 15

Ser Phe Arg Gly Pro Asp Thr Arg Glu Gln Phe Thr Asp Phe Leu Tyr
20 25 30

Gln Ser Leu Arg Arg Tyr Lys Ile His Thr Phe Arg Asp Asp Asp Glu
35 40 45

Leu Leu Lys Gly Lys Glu Ile Gly Pro Asn Leu Leu Arg Ala Ile Asp
50 55 60

Gln Ser Lys Ile Tyr Val Pro Ile Ile Ser Ser Gly Tyr Ala Asp Ser
65 70 75 80

Lys Trp Cys Leu Met Glu Leu Ala Glu Ile Val Arg Arg Gln Glu Glu
85 90 95

Asp Pro Arg Arg Ile Ile Leu Pro Ile Phe Tyr Met Val Asp Pro Ser
100 105 110

Asp Val Arg His Gln Thr Gly Cys Tyr Lys Lys Ala Phe Arg Lys His
115 120 125

Ala Asn Lys Phe Asp Gly Gln Thr Ile Gln Asn Trp Lys Asp Ala Leu
130 135 140

Lys Lys Val Gly Asp Leu Lys Gly Trp His Ile Gly Lys Asn Asp Lys
145 150 155 160

Gln Gly Ala Ile Ala Asp Lys Val Ser Ala Asp Ile Trp Ser His Ile
165 170 175

Ser Lys Glu Asn Leu Ile Leu Glu Thr Asp Glu Leu Val Gly Ile Asp
180 185 190

Asp His Ile Thr Ala Val Leu Glu Lys Leu Ser Leu Asp Ser Glu Asn
195 200 205

Val Thr Met Val Gly Leu Tyr Gly Met Gly Gly Ile Gly Lys Thr Thr
210 215 220

Thr Ala Lys Ala Val Tyr Asn Lys Ile Ser Ser Cys Phe Asp Cys Cys
225 230 235 240

Cys Phe Ile Asp Asn Ile Arg Glu Thr Gln Glu Lys Asp Gly Val Val
245 250 255

Val Leu Gln Lys Lys Leu Val Ser Glu Ile Leu Arg Ile Asp Ser Gly
260 265 270

Ser Val Gly Phe Asn Asn Asp Ser Gly Gly Arg Lys Thr Ile Lys Glu
275 280 285

Arg Val Ser Arg Phe Lys Ile Leu Val Val Leu Asp Asp Val Asp Glu
290 295 300

Lys Phe Lys Phe Glu Asp Met Leu Gly Ser Pro Lys Asp Phe Ile Ser
305 310 315 320

Gln Ser Arg Phe Ile Ile Thr Ser Arg Ser Met Arg Val Leu Gly Thr
325 330 335

Leu Asn Glu Asn Gln Cys Lys Leu Tyr Glu Val Gly Ser Met Ser Lys
340 345 350

Pro Arg Ser Leu Glu Leu Phe Ser Lys His Ala Phe Lys Lys Asn Thr
355 360 365

Pro Pro Ser Ser Tyr Tyr Glu Thr Leu Ala Asn Asp Val Val Asp Thr
370 375 380

Thr Ala Gly Leu Pro Leu Thr Leu Lys Val Ile Gly Ser Leu Leu Phe
385 390 395 400

Lys Gln Glu Ile Ala Val Trp Glu Asp Thr Leu Glu Gln Leu Arg Arg
405 410 415

Thr Leu Asn Leu Asp Glu Val Tyr Asp Arg Leu Lys Ile Ser Tyr Asp
420 425 430

Ala Leu Asn Pro Glu Ala Lys Glu Ile Phe Leu Asp Ile Ala Cys Phe
435 440 445

Phe Ile Gly Gln Asn Lys Glu Glu Pro Tyr Tyr Met Trp Thr Asp Cys
450 455 460

Asn Phe Tyr Pro Ala Ser Asn Ile Ile Phe Leu Ile Gln Arg Cys Met
465 470 475 480

Ile Gln Val Gly Asp Asp Asp Glu Phe Lys Met His Asp Gln Leu Arg
485 490 495

Asp Met Gly Arg Glu Ile Val Arg Arg Glu Asp Val Leu Pro Trp Lys
500 505 510

Ser Arg Ile Trp Ser Ala Glu Glu Gly Ile Asp Leu Leu Leu Asn Lys
515 520 525

Arg Lys Gly Ser Ser Lys Val Lys Ala Ile Ser Ile Pro Trp Gly Val
530 535 540

Lys Tyr Glu Phe Lys Ser Glu Cys Phe Leu Asn Leu Ser Glu Leu Arg
545 550 555 560

Tyr Leu His Ala Arg Glu Ala Met Leu Thr Gly Asp Phe Asn Asn Leu
565 570 575

Leu Pro Asn Leu Lys Trp Leu Glu Leu Pro Phe Tyr Lys His Gly Glu
580 585 590

Asp Asp Pro Pro Leu Thr Asn Tyr Thr Met Lys Asn Leu Ile Ile Val
595 600 605

Ile Leu Glu His Ser His Ile Thr Ala Asp Asp Trp Gly Gly Trp Arg
610 615 620

His Met Met Lys Met Ala Glu Arg Leu Lys Val Val Arg Leu Ala Ser
625 630 635 640

Asn Tyr Ser Leu Tyr Gly Arg Arg Val Arg Leu Ser Asp Cys Trp Arg
645 650 655

Phe Pro Lys Ser Ile Glu Val Leu Ser Met Thr Ala Ile Glu Met Asp
660 665 670

Glu Val Asp Ile Gly Glu Leu Lys Lys Leu Lys Thr Leu Val Leu Lys
675 680 685

Pro Cys Pro Ile Gln Lys Ile Ser Gly Gly Thr Phe Gly Met Leu Lys
690 695 700

Gly Leu Arg Glu Leu Cys Leu Glu Phe Asn Trp Gly Thr Asn Leu Arg
705 710 715 720

Glu Val Val Ala Asp Ile Gly Gln Leu Ser Ser Leu Lys Val Leu Lys
725 730 735

Thr Gly Ala Lys Glu Val Glu Ile Asn Glu Phe Pro Leu Gly Leu Lys
740 745 750

Thr Glu Leu Ser Thr Ser Ser Arg Ile Pro Asn Asn Leu Ser Gln Leu
755 760 765

Leu Asp Leu Glu Val Leu Lys Val Tyr Asp Cys Lys Asp Gly Phe Asp
770 775 780

Met Pro Pro Ala Ser Pro Ser Glu Asp Glu Ser Ser Val Trp Trp Lys
785 790 795 800

Val Ser Lys Leu Lys Ser Leu Gln Leu Glu Lys Thr Arg Ile Asn Val
805 810 815

Asn Val Val Asp Asp Ala Ser Ser Gly Gly His Leu Pro Arg Tyr Leu
820 825 830

Leu Pro Thr Ser Leu Thr Tyr Leu Lys Ile Tyr Gln Cys Thr Glu Pro
835 840 845

Thr Trp Leu Pro Gly Ile Glu Asn Leu Glu Asn Leu Thr Ser Leu Glu
850 855 860

Val Asn Asp Ile Phe Gln Thr Leu Gly Gly Asp Leu Asp Gly Leu Gln
865 870 875 880

Gly Leu Arg Ser Leu Glu Ile Leu Arg Ile Arg Lys Val Asn Gly Leu
885 890 895

Ala Arg Ile Lys Gly Leu Lys Asp Leu Leu Cys Ser Ser Thr Cys Lys
900 905 910

Leu Arg Lys Phe Tyr Ile Thr Glu Cys Pro Asp Leu Ile Glu Leu Leu
915 920 925

Pro Cys Glu Leu Gly Val Gln Thr Val Val Val Pro Ser Met Ala Glu
930 935 940

Leu Thr Ile Arg Asp Cys Pro Arg Leu Glu Val Gly Pro Met Ile Arg
945 950 955 960

Ser Leu Pro Lys Phe Pro Met Leu Lys Lys Leu Asp Leu Ala Val Ala
965 970 975

Asn Ile Thr Lys Glu Glu Asp Leu Asp Ala Ile Gly Ser Leu Glu Glu
980 985 990

Leu Val Ser Leu Glu Leu Glu Leu Asp Asp Thr Ser Ser Gly Ile Glu
995 1000 1005

Arg Ile Val Ser Ser Ser Lys Leu Gln Lys Leu Thr Thr Leu Val Val
1010 1015 1020

Lys Val Pro Ser Leu Arg Glu Ile Glu Gly Leu Glu Glu Leu Lys Ser
1025 1030 1035 1040

Leu Gln Asp Leu Tyr Leu Glu Gly Cys Thr Ser Leu Gly Arg Leu Pro
1045 1050 1055

Leu Glu Lys Leu Lys Glu Leu Asp Ile Gly Gly Cys Pro Asp Leu Thr
1060 1065 1070

Glu Leu Val Gln Thr Val Val Ala Val Pro Ser Leu Arg Gly Leu Thr
1075 1080 1085

Ile Arg Asp Cys Pro Arg Leu Glu Val Gly Pro Met Ile Gln Ser Leu
1090 1095 1100

Pro Lys Phe Pro Met Leu Asn Glu Leu Thr Leu Ser Met Val Asn Ile
1105 1110 1115 1120

Thr Lys Glu Asp Glu Leu Glu Val Leu Gly Ser Leu Glu Glu Leu Asp
1125 1130 1135

Ser Leu Glu Leu Thr Leu Asp Asp Thr Cys Ser Ser Ile Glu Arg Ile
1140 1145 1150

Ser Phe Leu Ser Lys Leu Gln Lys Leu Thr Thr Leu Ile Val Glu Val
1155 1160 1165

Pro Ser Leu Arg Glu Ile Glu Gly Leu Ala Glu Leu Lys Ser Leu Arg
1170 1175 1180

Ile Leu Tyr Leu Glu Gly Cys Thr Ser Leu Glu Arg Leu Trp Pro Asp
1185 1190 1195 1200

Gln Gln Gln Leu Gly Ser Leu Lys Asn
1205

(2) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1143 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 

Met Ala Ser Ser Ser Ser Ser Ser Arg Trp Ser Tyr Asp Val Phe Leu
1 5 10 15

Ser Phe Arg Gly Glu Asp Thr Arg Lys Thr Phe Thr Ser His Leu Tyr
20 25 30

Glu Val Leu Asn Asp Lys Gly Ile Lys Thr Phe Gln Asp Asp Lys Arg
35 40 45

Leu Glu Tyr Gly Ala Thr Ile Pro Gly Glu Leu Cys Lys Ala Ile Glu
50 55 60

Glu Ser Gln Phe Ala Ile Val Val Phe Ser Glu Asn Tyr Ala Thr Ser
65 70 75 80

Arg Trp Cys Leu Asn Glu Leu Val Lys Ile Met Glu Cys Lys Thr Arg
85 90 95

Phe Lys Gln Thr Val Ile Pro Ile Phe Tyr Asp Val Asp Pro Ser His
100 105 110

Val Arg Asn Gln Lys Glu Ser Phe Ala Lys Ala Phe Glu Glu His Glu
115 120 125

Thr Lys Tyr Lys Asp Asp Val Glu Gly Ile Gln Arg Trp Arg Ile Ala
130 135 140

Leu Asn Glu Ala Ala Asn Leu Lys Gly Ser Cys Asp Asn Arg Asp Lys
145 150 155 160

Thr Asp Ala Asp Cys Ile Arg Gln Ile Val Asp Gln Ile Ser Ser Lys
165 170 175

Leu Cys Lys Ile Ser Leu Ser Tyr Leu Gln Asn Ile Val Gly Ile Asp
180 185 190

Thr His Leu Glu Lys Ile Glu Ser Leu Leu Glu Ile Gly Ile Asn Gly
195 200 205

Val Arg Ile Met Gly Ile Trp Gly Met Gly Gly Val Gly Lys Thr Thr
210 215 220

Ile Ala Arg Ala Ile Phe Asp Thr Leu Leu Gly Arg Met Asp Ser Ser
225 230 235 240

Tyr Gln Phe Asp Gly Ala Cys Phe Leu Lys Asp Ile Lys Glu Asn Lys
245 250 255

Arg Gly Met His Ser Leu Gln Asn Ala Leu Leu Ser Glu Leu Leu Arg
260 265 270

Glu Lys Ala Asn Tyr Asn Asn Glu Glu Asp Gly Lys His Gln Met Ala
275 280 285

Ser Arg Leu Arg Ser Lys Lys Val Leu Ile Val Leu Asp Asp Ile Asp
290 295 300

Asn Lys Asp His Tyr Leu Glu Tyr Leu Ala Gly Asp Leu Asp Trp Phe
305 310 315 320

Gly Asn Gly Ser Arg Ile Ile Ile Thr Thr Arg Asp Lys His Leu Ile
325 330 335

Glu Lys Asn Asp Ile Ile Tyr Glu Val Thr Ala Leu Pro Asp His Glu
340 345 350

Ser Ile Gln Leu Phe Lys Gln His Ala Phe Gly Lys Glu Val Pro Asn
355 360 365

Glu Asn Phe Glu Lys Leu Ser Leu Glu Val Val Asn Tyr Ala Lys Gly
370 375 380

Leu Pro Leu Ala Leu Lys Val Trp Gly Ser Leu Leu His Asn Leu Arg
385 390 395 400

Leu Thr Glu Trp Lys Ser Ala Ile Glu His Met Lys Asn Asn Ser Tyr
405 410 415

Ser Gly Ile Ile Asp Lys Leu Lys Ile Ser Tyr Asp Gly Leu Glu Pro
420 425 430

Lys Gln Gln Glu Met Phe Leu Asp Ile Ala Cys Phe Leu Arg Gly Glu
435 440 445

Glu Lys Asp Tyr Ile Leu Gln Ile Leu Glu Ser Cys His Ile Gly Ala
450 455 460

Glu Tyr Gly Leu Arg Ile Leu Ile Asp Lys Ser Leu Val Phe Ile Ser
465 470 475 480

Glu Tyr Asn Gln Val Gln Met His Asp Leu Ile Gln Asp Met Gly Lys
485 490 495

Tyr Ile Val Asn Phe Gln Lys Asp Pro Gly Glu Arg Ser Arg Leu Trp
500 505 510

Leu Ala Lys Glu Val Glu Glu Val Met Ser Asn Asn Thr Gly Thr Met
515 520 525

Ala Met Glu Ala Ile Trp Val Ser Ser Tyr Ser Ser Thr Leu Arg Phe
530 535 540

Ser Asn Gln Ala Val Lys Asn Met Lys Arg Leu Arg Val Phe Asn Met
545 550 555 560

Gly Arg Ser Ser Thr His Tyr Ala Ile Asp Tyr Leu Pro Asn Asn Leu
565 570 575

Arg Cys Phe Val Cys Thr Asn Tyr Pro Trp Glu Ser Phe Pro Ser Thr
580 585 590

Phe Glu Leu Lys Met Leu Val His Leu Gln Leu Arg His Asn Ser Leu
595 600 605

Arg His Leu Trp Thr Glu Thr Lys His Leu Pro Ser Leu Arg Arg Ile
610 615 620

Asp Leu Ser Trp Ser Lys Arg Leu Thr Arg Thr Pro Asp Phe Thr Gly
625 630 635 640

Met Pro Asn Leu Glu Tyr Val Asn Leu Tyr Gln Cys Ser Asn Leu Glu
645 650 655

Glu Val His His Ser Leu Gly Cys Cys Ser Lys Val Ile Gly Leu Tyr
660 665 670

Leu Asn Asp Cys Lys Ser Leu Lys Arg Phe Pro Cys Val Asn Val Glu
675 680 685

Ser Leu Glu Tyr Leu Gly Leu Arg Ser Cys Asp Ser Leu Glu Lys Leu
690 695 700

Pro Glu Ile Tyr Gly Arg Met Lys Pro Glu Ile Gln Ile His Met Gln
705 710 715 720

Gly Ser Gly Ile Arg Glu Leu Pro Ser Ser Ile Phe Gln Tyr Lys Thr
725 730 735

His Val Thr Lys Leu Leu Leu Trp Asn Met Lys Asn Leu Val Ala Leu
740 745 750

Pro Ser Ser Ile Cys Arg Leu Lys Ser Leu Val Ser Leu Ser Val Ser
755 760 765

Gly Cys Ser Lys Leu Glu Ser Leu Pro Glu Glu Ile Gly Asp Leu Asp
770 775 780

Asn Leu Arg Val Phe Asp Ala Ser Asp Thr Leu Ile Leu Arg Pro Pro
785 790 795 800

Ser Ser Ile Ile Arg Leu Asn Lys Leu Ile Ile Leu Met Phe Arg Gly
805 810 815

Phe Lys Asp Gly Val His Phe Glu Phe Pro Pro Val Ala Glu Gly Leu
820 825 830

His Ser Leu Glu Tyr Leu Asn Leu Ser Tyr Cys Asn Leu Ile Asp Gly
835 840 845

Gly Leu Pro Glu Glu Ile Gly Ser Leu Ser Ser Leu Lys Lys Leu Asp
850 855 860

Leu Ser Arg Asn Asn Phe Glu His Leu Pro Ser Ser Ile Ala Gln Leu
865 870 875 880

Gly Ala Leu Gln Ser Leu Asp Leu Lys Asp Cys Gln Arg Leu Thr Gln
885 890 895

Leu Pro Glu Leu Pro Pro Glu Leu Asn Glu Leu His Val Asp Cys His
900 905 910

Met Ala Leu Lys Phe Ile His Tyr Leu Val Thr Lys Arg Lys Lys Leu
915 920 925

His Arg Val Lys Leu Asp Asp Ala His Asn Asp Thr Met Tyr Asn Leu
930 935 940

Phe Ala Tyr Thr Met Phe Gln Asn Ile Ser Ser Met Arg His Asp Ile
945 950 955 960

Ser Ala Ser Asp Ser Leu Ser Leu Thr Val Phe Thr Gly Gln Pro Tyr
965 970 975

Pro Glu Lys Ile Pro Ser Trp Phe His His Gln Gly Trp Asp Ser Ser
980 985 990

Val Ser Val Asn Leu Pro Glu Asn Trp Tyr Ile Pro Asp Lys Phe Leu
995 1000 1005

Gly Phe Ala Val Cys Tyr Ser Arg Ser Leu Ile Asp Thr Thr Ala His
1010 1015 1020

Leu Ile Pro Val Cys Asp Asp Lys Met Ser Arg Met Thr Gln Lys Leu
1025 1030 1035 1040

Ala Leu Ser Glu Cys Asp Thr Glu Ser Ser Asn Tyr Ser Glu Trp Asp
1045 1050 1055

Ile His Phe Phe Phe Val Pro Phe Ala Gly Leu Trp Asp Thr Ser Lys
1060 1065 1070

Ala Asn Gly Lys Thr Pro Asn Asp Tyr Gly Ile Ile Arg Leu Ser Phe
1075 1080 1085

Ser Gly Glu Glu Lys Met Tyr Gly Arg Leu Arg Leu Tyr Lys Glu Gly
1090 1095 1100

Pro Glu Val Asn Ala Leu Leu Gln Met Arg Glu Asn Ser Asn Glu Pro
1105 1110 1115 1120

Thr Glu His Ser Thr Gly Ile Arg Arg Thr Gln Tyr Asn Asn Arg Thr
1125 1130 1135

Ser Phe Tyr Glu Leu Ile Asn
1140

(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 429 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:

Leu Arg Ser Lys Leu Asp Leu Ile Ile Asp Leu Lys His Gln Ile Glu
1 5 10 15

Ser Val Lys Glu Gly Leu Leu Cys Leu Arg Ser Phe Ile Asp His Phe
20 25 30

Ser Glu Ser Tyr Val Glu His Asp Glu Ala Cys Gly Leu Ile Ala Arg
35 40 45

Val Ser Val Met Ala Tyr Lys Ala Glu Tyr Val Ile Asp Ser Cys Leu
50 55 60

Ala Tyr Ser His Pro Leu Trp Tyr Lys Val Leu Trp Ile Ser Glu Val
65 70 75 80

Leu Glu Asn Ile Lys Leu Val Asn Lys Val Val Gly Glu Thr Cys Glu
85 90 95

Arg Arg Asn Thr Glu Val Thr Val His Glu Val Ala Lys Thr Thr Thr
100 105 110

Asn Val Ala Pro Ser Phe Ser Ala Tyr Thr Gln Arg Ala Asn Glu Glu
115 120 125

Met Glu Gly Phe Gln Asp Thr Ile Asp Glu Leu Lys Asp Lys Leu Leu
130 135 140

Gly Gly Ser Pro Glu Leu Asp Val Ile Ser Ile Val Gly Met Pro Gly
145 150 155 160

Leu Gly Lys Thr Thr Leu Ala Lys Lys Ile Tyr Asn Asp Pro Glu Val
165 170 175

Thr Ser Arg Phe Asp Val His Ala Gln Cys Val Val Thr Gln Leu Tyr
180 185 190

Ser Trp Arg Glu Leu Leu Leu Thr Ile Leu Asn Asp Val Leu Glu Pro
195 200 205

Ser Asp Arg Asn Glu Lys Glu Asp Gly Glu Ile Ala Asp Glu Leu Arg
210 215 220

Arg Phe Leu Leu Thr Lys Arg Phe Leu Ile Leu Ile Asp Asp Val Trp
225 230 235 240

Asp Tyr Lys Val Trp Asp Asn Leu Cys Met Cys Phe Ser Asp Val Ser
245 250 255

Asn Arg Ser Arg Ile Ile Leu Thr Thr Arg Leu Asn Asp Val Ala Glu
260 265 270

Tyr Val Lys Cys Glu Ser Asp Pro His His Leu Arg Leu Phe Arg Asp
275 280 285

Asp Glu Ser Trp Thr Leu Leu Gln Lys Glu Val Phe Gln Gly Glu Ser
290 295 300

Cys Pro Pro Glu Leu Glu Asp Val Gly Phe Glu Ile Ser Lys Ser Cys
305 310 315 320

Arg Gly Leu Pro Leu Ser Val Val Leu Val Ala Gly Val Leu Lys Gln
325 330 335

Lys Lys Lys Thr Leu Asp Ser Trp Lys Val Val Glu Gln Ser Leu Ser
340 345 350

Ser Gln Arg Ile Gly Ser Leu Glu Glu Ser Ile Ser Ile Ile Gly Phe
355 360 365

Ser Tyr Lys Asn Leu Pro His Tyr Leu Lys Pro Cys Phe Leu Tyr Phe
370 375 380

Gly Gly Phe Leu Gln Gly Lys Asp Ile His Asp Ser Lys Met Thr Lys
385 390 395 400

Leu Trp Val Ala Glu Glu Phe Val Gln Ala Asn Asn Glu Lys Gly Gln
405 410 415

Glu Asp Thr Arg Thr Arg Phe Leu Gly Arg Ser Tyr Trp
420 425

(2) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 

Gly Met Gly Gly Ile Gly Lys Thr Thr Thr Ala
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:

Gly Met Gly Gly Val Gly Lys Thr Thr Ile Ala
1 5 10

(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:112:

Gly Met Pro Gly Leu Gly Lys Thr Thr Leu Ala
1 5 10

(2) INFORMATION FOR SEQ ID NO: 113:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 

Gly Pro Gly Gly Val Gly Lys Thr Thr Leu Met 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114: 

Phe Lys Ile Leu Val Val Leu Asp Asp Val Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:115: 

Lys Lys Val Leu Ile Val Leu Asp Asp Ile Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:

Lys Arg Phe Leu Ile Leu Ile Asp Asp Val Trp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: 

Lys Arg Phe Leu Leu Leu Leu Asp Asp Val Trp 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 118:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 

Ser Arg Phe Ile Ile Thr Ser Arg
1 5 

(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:

Ser Arg Ile Ile Ile Thr Thr Arg
1 5 

(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:

Ser Arg Ile Ile Leu Thr Thr Arg
1 5 

(2) INFORMATION FOR SEQ ID NO:121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 

Thr Thr Arg 
1 

(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:122: 

Gly Leu Pro Leu Thr Leu Lys Val 
1 5 

(2) INFORMATION FOR SEQ ID NO: 123: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 

Gly Leu Pro Leu Ala Leu Lys Val 
1 5

(2) INFORMATION FOR SEQ ID NO: 124:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:

Gly Leu Pro Leu Ser Val Val Leu
1 5 

(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125: 

Gly Leu Pro Leu Ala Leu Ile Thr
1 5 

(2) INFORMATION FOR SEQ ID NO: 126: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126: 

Lys Ile Ser Tyr Asp Ala Leu
1 5 

(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:

Lys Ile Ser Tyr Asp Gly Leu
1 5 

(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128: 

Gly Phe Ser Tyr Lys Asn Leu 
1 5 

(2) INFORMATION FOR SEQ ID NO: 129:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 

Val Phe Leu Ser Phe Arg Gly 
1 5 

(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 

Pro Ile Phe Tyr Met Val Asp Pro
1 5 

(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 

Pro Ile Phe Tyr Asp Val Asp Pro
1 5 

(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:

Val Gly Ile Asp Asp His
1 5 

(2) INFORMATION FOR SEQ ID NO: 133: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 

Val Gly Ile Asp Thr His
1 5 

(2) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 

Phe Leu Asp Ile Ala Cys Phe
1 5 

(2) INFORMATION FOR SEQ ID NO: 135:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:

Met His Asp Gln Leu Arg Asp Met Gly
1 5 

(2) INFORMATION FOR SEQ ID NO: 136: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136: 

Met His Asp Leu Ile Gln Asp Met Gly
1 5 

(2) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

Met His Asp Leu Ile Gln Asp Met Gly
1 5 

(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:

Ser Lys Leu Glu Ser Leu
1 5 

(2) INFORMATION FOR SEQ ID NO: 139:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 

Gly Leu His Ser Leu Glu Tyr Leu
1 5 

(2) INFORMATION FOR SEQ ID NO: 140: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 8 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140: 

Gly Leu Arg Ser Leu Glu Ile Leu
1 5 

(2) INFORMATION FOR SEQ ID NO: 141: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3432 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:

ACAAGTAAAA GAAAGAGCGA GAAATCATCG AAATGGATTT CATCTCATCT CTTATCGTTG 60
GCTGTGCTCA GGTGTTGTGT GAATCTATGA ATATGGCGGA GAGAAGAGGA CATAAGACTG 120
ATCTTAGACA AGCCATCACT GATCTTGAAA CAGCCATCGG TGACTTGAAG GCCATACGTG 180
ATGACCTGAC TTTACGGATC CAACAAGACG GTCTAGAGGG ACGAAGCTGC TCAAATCGTG 240
CCAGAGAGTG GCTTAGTGCG GTGCAAGTAA CGGAGACTAA AACAGCCCTA CTTTTAGTGA 300
GGTTTAGGCG TCGGGAACAG AGGACGCGAA TGAGGAGGAG ATACCTCAGT TGTTTCGGTT 360
GTGCCGACTA CAAACTGTGC AAGAAGGTTT CTGCCATATT GAAGAGCATT GGTGAGCTGA 420
GAGAACGCTC TGAAGCTATC AAAACAGATG GCGGGTCAAT TCAAGTAACT TGTAGAGAGA 480
TACCCATCAA GTCCGTTGTC GGAAATACCA CGATGATGGA ACAGGTTTTG GAATTTCTCA 540
GTGAAGAAGA AGAAAGAGGA ATCATTGGTG TTTATGGACC TGGTGGGGTT GGGAAGACAA 600
CGTTAATGCA GAGCATTAAC AACGAGCTGA TCACAAAAGG ACATCAGTAT GATGTACTGA 660
TTTGGGTTCA AATGTCCAGA GAATTCGGCG AGTGTACAAT TCAGCAAGCC GTTGGAGCAC 720
GGTTGGGTTT ATCTTGGGAC GAGAAGGAGA CCGGCGAAAA CAGAGCTTTG AAGATATACA 780
GAGCTTTGAG AGAGAAACGT TTCTTGTTGT TGCTAGATGA GTCTGGGAAG AGATAGACTT 840
GGAGAAAACT GGAGTTCCTC GACCTTGACA GGGAAAACAA ATGCAAGGTG ATGTTCACGA 900
CACGGTCTAT AGCATTATGC AACAATATGG GTGCGGAATA CAAGTTGAGA GTGGAGTTTC 960
TGGAGAAGAA ACACGCGTGG GAGCTGTTCT GTAGTAAGGT ATGGAGAAAA GATCTTTTAG 1020
AGTCATCATC AATTCGCCGG CTCGCGGAGA TTATAGTGAG TAAATGTGGA GGATTGCGAC 1080
TAGCGTTGAT CACTTTAGGA GGAGCCATGG CTCATAGAGA GACAGAAGAA GAGTGGATCC 1140
ATGCTAGTGA AGTTCTGACT AGATTTCCAG CAGAGATGAA GGGTATGAAC TATGTATTTG 1200
CCCTTTTGAA ATTCAGCTAC GACAACCTCG AGAGTGATCT GCTTCGGTCT TGTTTCTTGT 1260
ACTGCGCTTT ATTCCCAGAA GAACATTGTA TAGAGATCGA GCAGCTTGTT CAGTACTGGG 1320
TCGGCGAAGG GTTTCTCACC AGCTCCCATG GCGTTAACAC CATTTACAAG GGATATTTTC 1380
TCATTGGGGA TCTGAAAGCG GCATGTTTGT TGGAAACCGG AGATGAGAAA ACACAGGTGA 1440
AGATGCATAA TGTGGTCAGA AGCTTTGCAT TGTGGATGGC ATCTGAACAG GGGACTTATA 1500
AGGAGCTGAT CCTAGTTGAG CCTAGCATGG GACATACTGA AGCTCCTAAA GCAGAAAACT 1560
GGCGAGAAGC TTGGTGATCT CATTGTTAGA TAACAGAATC CAGACCTTGC CTGAAAAACT 1620
CATATGCCCG AAACTGACAA CACTGATGCT CCAACAGAAC AGCTCTTTGA AGAAGATTCC 1680
AACAGGGTTT TTCATGCATA TGCCTGTTCT CAGAGTCTTG GACTTGTCGT TCACAAGTAT 1740
CACTGAGATT CCGTTGTCTA TCAAGTATTT GGTGGAGTTG TATCATCTGT CTATGTCAGG 1800
AACAAAGATA AGTGTATTGC CACAGGAGCT TGGGAATCTT AGAAAACTGA AGCATCTGGA 1860
CCTACAAAGA ACTCAGTTTC TTCAGACGAT CCCACGAGAT GCCATATGTT GGCTGAGCAA 1920
GCTCGAGGTT CTGAACTTGT ACTACAGTTA CGCCGGTTGG GAACTGGAGA GCTTTGGAGA 1980
AGATGAAGCA GAAGAACTCG GATTCGCTGA CTTGGAATAC TTGGAAAACC TAACCACACT 2040
CGGTATCACT GTTCTCTCAT TGGAGACCCT AAAAACTCTC TTCGAGTTCG GTGCTTTGCA 2100
TAAACATATA CAGCATCTCC ACGTTGAAGA GTGCAATGAA CTCCTCTACT TCAATCTCCC 2160
ATCACTCACT AACCATGGCA GGAACCTGAG AAGACTTAGC ATTAAAAGTT GCCATGACTT 2220
GGAGTACCTG GTCACACCCG CAGATTTTGA AAATGATTGG CTTCCGAGTC TAGAGGTTCT 2280
GACGTTACAC AGCCTTCACA ACTTAACCAG AGTGTGGGGA AATTCTGTAA GCCAAGATTG 2340
TCTGCGGAAT ATCCGTTGCA TAAACATTTC ACACTGCAAC AAGCTGAAGA ATGTCTCATG 2400
GGTTCAGAAA CTCCCAAAGC TAGAGGTGAT TGAACTGTTC GACTGCAGAG AGATAGAGGA 2460
ATTGATAAGC GAACACGAGA GTCCATCCGT CGAAGATCCA ACATTGTTCC CAAGCCTGAA 2520
GACCTTGAGA ACTAGGGATC TGCCAGAACT AAACAGCATC CTCCCATCTC GATTTTCATT 2580
CCAAAAAGTT GAAACATTAG TCATCACAAA TTGCCCCAGA GTTAAGAAAC TGCCGTTTCA 2640
GGAGAGGAGG ACCCAGATGA ACTTGCCAAC AGTTTATTGT GAGGAGAAAT GGTGGAAAGC 2700
ACTGGAAAAA GTTGAAACAT TAGTCATCAC AAATTGCCCC AGAGTTAAGA AACTGCCGTT 2760
TCAGGAGAGG AGGACCCAGA TGAACTTGCC AACAGTTTAT TGTGAGGAGA AATGGTGGAA 2820
AGCACTGGAA AAAGATCAAC CAAACGAAGA GCTTTGTTAT TTACCGCGCT TTGTTCCAAA 2880
TTGATATAAG AGCTAAGAGC ACTCTGTACA AATATGTCCA TTCATAAGTA GCAGGAAGCC 2940
AGGAAGGTTG TTCCAGTGAA GTCATCAACT TTCCACTAGA CCACAAAACT AGAGATTATG 3000
TAATCATAAA AACCAAACTA TCCGCGATCA AATAGATCTC ACGACTATGA GGACGAAGAC 3060
TCACCGAGTA TCGTCGATAT AGAAACTCCA AGCTCCAGTT CCGATCAGTG AAGACGAACA 3120
AGTTTATCAG ATCTCTGCAA CAATTCTGGG AATCGTCACC TCAGATTAGA CCTCCAGTAA 3180
GAAGTGAGAA AGCATGGACG ACGACTGTGA AGAATTGAGC TAATGAGCTG AACCGGATCC 3240
GGTGAAATTG CAGAACCGGA TCGGAGAAGA AGAATTTTGC ATTTGTGCAT CTTTATTTTT 3300
AATTGTTACG TTTGAGCCCC AATAATCATA GATATTGTAG TGAAGACCAA ATTTCATGGT 3360
GGATCAATCA AATTGTATTT TCAAATTTTC GTAGTGTAAT AACGGAAAAA GGAATAAAAA 3420
GGTCACTGAG TA 3432



(2) INFORMATION FOR SEQ ID NO: 142:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 909 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:

Met Asp Phe Ile Ser Ser Leu Ile Val Gly Cys Ala Gln Val Leu Cys
1 5 10 15

Glu Ser Met Asn Met Ala Glu Arg Arg Gly His Lys Thr Asp Leu Arg
20 25 30

Gln Ala Ile Thr Asp Leu Glu Thr Ala Ile Gly Asp Leu Lys Ala Ile
35 40 45

Arg Asp Asp Leu Thr Leu Arg Ile Gln Gln Asp Gly Leu Glu Gly Arg
50 55 60

Ser Cys Ser Asn Arg Ala Arg Glu Trp Leu Ser Ala Val Gln Val Thr
65 70 75 80

Glu Thr Lys Thr Ala Leu Leu Leu Val Arg Phe Arg Arg Arg Glu Gln
85 90 95

Arg Thr Arg Met Arg Arg Arg Tyr Leu Ser Cys Phe Gly Cys Ala Asp
100 105 110

Tyr Lys Leu Cys Lys Lys Val Ser Ala Ile Leu Lys Ser Ile Gly Glu
115 120 125

Leu Arg Glu Arg Ser Glu Ala Ile Lys Thr Asp Gly Gly Ser Ile Gln
130 135 140

Val Thr Cys Arg Glu Ile Pro Ile Lys Ser Val Val Gly Asn Thr Thr
145 150 155 160

Met Met Glu Gln Val Leu Glu Phe Leu Ser Glu Glu Glu Glu Arg Gly
165 170 175

Ile Ile Gly Val Tyr Gly Pro Gly Gly Val Gly Lys Thr Thr Leu Met
180 185 190

Gln Ser Ile Asn Asn Glu Leu Ile Thr Lys Gly His Gln Tyr Asp Val
195 200 205

Leu Ile Trp Val Gln Met Ser Arg Glu Phe Gly Glu Cys Thr Ile Gln
210 215 220

Gln Ala Val Gly Ala Arg Leu Gly Leu Ser Trp Asp Glu Lys Glu Thr
225 230 235 240

Gly Glu Asn Arg Ala Leu Lys Ile Tyr Arg Ala Leu Arg Gln Lys Arg
245 250 255

Phe Leu Leu Leu Leu Asp Asp Val Trp Glu Glu Ile Asp Leu Glu Lys
260 265 270

Thr Gly Val Pro Arg Pro Asp Arg Glu Asn Lys Cys Lys Val Met Phe
275 280 285

Thr Thr Arg Ser Ile Ala Leu Cys Asn Asn Met Gly Ala Glu Tyr Lys
290 295 300

Leu Arg Val Glu Phe Leu Glu Lys Lys His Ala Trp Glu Leu Phe Cys
305 310 315 320

Ser Lys Val Trp Arg Lys Asp Leu Leu Glu Ser Ser Ser Ile Arg Arg
325 330 335

Leu Ala Glu Ile Ile Val Ser Lys Cys Gly Gly Leu Pro Leu Ala Leu
340 345 350

Ile Thr Leu Gly Gly Ala Met Ala His Arg Glu Thr Glu Glu Glu Trp
355 360 365

Ile His Ala Ser Glu Val Leu Thr Arg Phe Pro Ala Glu Met Lys Gly
370 375 380

Met Asn Tyr Val Phe Ala Leu Leu Lys Phe Ser Tyr Asp Asn Leu Glu
385 390 395 400

Ser Asp Leu Leu Arg Ser Cys Phe Leu Tyr Cys Ala Leu Phe Pro Glu
405 410 415

Glu His Ser Ile Glu Ile Glu Gln Leu Val Glu Tyr Trp Val Gly Glu
420 425 430

Gly Phe Leu Thr Ser Ser His Gly Val Asn Thr Ile Tyr Lys Gly Tyr
435 440 445

Phe Leu Ile Gly Asp Leu Lys Ala Ala Cys Leu Leu Glu Thr Gly Asp
450 455 460

Glu Lys Thr Gln Val Lys Met His Asn Val Val Arg Ser Phe Ala Leu
465 470 475 480

Trp Met Ala Ser Glu Gln Gly Thr Tyr Lys Glu Leu Ile Leu Val Glu
485 490 495

Pro Ser Met Gly His Thr Glu Ala Pro Lys Ala Glu Asn Trp Arg Gln
500 505 510

Ala Leu Val Ile Ser Leu Leu Asp Asn Arg Ile Gln Thr Leu Pro Glu
515 520 525

Lys Leu Ile Cys Pro Lys Leu Thr Thr Leu Met Leu Gln Gln Asn Ser
530 535 540

Ser Leu Lys Lys Ile Pro Thr Gly Phe Phe Met His Met Pro Val Leu
545 550 555 560

Arg Val Leu Asp Leu Ser Phe Thr Ser Ile Thr Glu Ile Pro Leu Ser
565 570 575

Ile Lys Tyr Leu Val Glu Leu Tyr His Leu Ser Met Ser Gly Thr Lys
580 585 590

Ile Ser Val Leu Pro Gln Glu Leu Gly Asn Leu Arg Lys Leu Lys His
595 600 605

Leu Asp Leu Gln Arg Thr Gln Phe Leu Gln Thr Ile Pro Arg Asp Ala
610 615 620

Ile Cys Trp Leu Ser Lys Leu Glu Val Leu Asn Leu Tyr Tyr Ser Tyr
625 630 635 640

Ala Gly Trp Glu Leu Gln Ser Phe Gly Glu Asp Glu Ala Glu Glu Leu
645 650 655

Gly Phe Ala Asp Leu Glu Tyr Leu Glu Asn Leu Thr Thr Leu Gly Ile
660 665 670

Thr Val Leu Ser Leu Glu Thr Leu Lys Thr Leu Phe Glu Phe Gly Ala
675 680 685

Leu His Lys His Ile Gln His Leu His Val Glu Glu Cys Asn Glu Leu
690 695 700

Leu Tyr Phe Asn Leu Pro Ser Leu Thr Asn His Gly Arg Asn Leu Arg
705 710 715 720

Arg Leu Ser Ile Lys Ser Cys His Asp Leu Glu Tyr Leu Val Thr Pro
725 730 735

Ala Asp Phe Glu Asn Asp Trp Leu Pro Ser Leu Glu Val Leu Thr Leu
740 745 750

His Ser Leu His Asn Leu Thr Arg Val Trp Gly Asn Ser Val Ser Gln
755 760 765

Asp Cys Leu Arg Asn Ile Arg Cys Ile Asn Ile Ser His Cys Asn Lys
770 775 780

Leu Lys Asn Val Ser Trp Val Gln Lys Leu Pro Lys Leu Glu Val Ile
785 790 795 800

Glu Leu Phe Asp Cys Arg Glu Ile Glu Glu Leu Ile Ser Glu His Glu
805 810 815

Ser Pro Ser Val Glu Asp Pro Thr Leu Phe Pro Ser Leu Lys Thr Leu
820 825 830

Arg Thr Arg Asp Leu Pro Glu Leu Asn Ser Ile Leu Pro Ser Arg Phe
835 840 845

Ser Phe Gln Lys Val Glu Thr Leu Val Ile Thr Asn Cys Pro Arg Val
850 855 860

Lys Lys Leu Pro Phe Gln Glu Arg Arg Thr Gln Met Asn Leu Pro Thr
865 870 875 880

Val Tyr Cys Glu Glu Lys Trp Trp Lys Ala Leu Glu Lys Asp Gln Pro
885 890 895

Asn Glu Glu Leu Cys Tyr Leu Pro Arg Phe Val Pro Asn
900 905

(2) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:

Pro Lys Ala Glu Asn Trp Arg Gln Ala Leu Val Ile Ser Leu Leu Asp
1 5 10 15

Asn Arg Ile Gln Thr Leu
20 

(2) INFORMATION FOR SEQ ID NO: 144: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:

Pro Glu Lys Leu Ile Cys Pro Lys Leu Thr Thr Leu Met Leu Gln Gln
1 5 10 15

Asn Ser Ser Leu Lys Lys Ile
20 

(2) INFORMATION FOR SEQ ID NO: 145: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 

Pro Thr Gly Phe Phe Met His Met Pro Val Leu Arg Val Leu Asp Leu
1 5 10 15

Ser Phe Thr Ser Ile Thr Glu Ile
20 



(2) INFORMATION FOR SEQ ID NO: 146: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146: 

Pro Leu Ser Ile Lys Tyr Leu Val Glu Leu Tyr His Leu Ser Met Ser
1 5 10 15

Gly Thr Lys Ile Ser Val Leu
20 

(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 
Pro Gln Glu Leu Gly Asn Leu Arg Lys Leu Lys His Leu Asp Leu Gln
1 5 10 15

Arg Thr Gln Phe Leu Gln Thr Ile
20 



(2) INFORMATION FOR SEQ ID NO: 148:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:148: 

Pro Arg Asp Ala Ile Cys Trp Leu Ser Lys Leu Glu Val Leu Asn Leu
1 5 10 15

Tyr Tyr Ser Tyr Ala Gly Trp Glu Leu Gln Ser Phe Gly Glu Asp Glu
20 25 30 

Ala Glu Glu Leu Gly 
35 



(2) INFORMATION FOR SEQ ID NO: 149: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149: 

Phe Ala Asp Leu Glu Tyr Leu Glu Asn Leu Thr Thr Leu Gly Ile Thr
1 5 10 15

Val Leu Ser Leu Glu Thr Leu Lys Thr 
20 25 



(2) INFORMATION FOR SEQ ID NO: 150: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:

Leu Phe Glu Phe Gly Ala Leu His Lys His Ile Gln His Leu His Val
1 5 10 15

Glu Glu Cys Asn Glu Leu Leu Tyr Phe Asn Leu 
20 25 



(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151:

Pro Ser Leu Thr Asn His Gly Arg Asn Leu Arg Arg Leu Ser Ile Lys
1 5 10 15

Ser Cys His Asp Leu Glu Tyr Leu Val Thr 
20 25 

(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152: 

Pro Ala Asp Phe Glu Asn Asp Trp Leu Pro Ser Leu Glu Val Leu Thr 
1 5 10 15

Leu His Ser Leu His Asn Leu Thr Arg Val Trp Gly Asn 
20 25 

(2) INFORMATION FOR SEQ ID NO: 153: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153:

Ser Val Ser Gln Asp Cys Leu Arg Asn Ile Arg Cys Ile Asn Ile Ser
1 5 10 15

His Cys Asn Lys Leu Lys Asn Val Ser Trp Val Gln Lys Leu
20 25 30 



(2) INFORMATION FOR SEQ ID NO:154: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 

Pro Lys Leu Glu Val Ile Glu Leu Phe Asp Cys Arg Glu Ile Glu Glu
1 5 10 15

Leu Ile Ser Glu His Glu Ser Pro Ser Val Glu Asp
20 25 

(2) INFORMATION FOR SEQ ID NO: 155: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155:

Pro Thr Leu Phe Pro Ser Leu Lys Thr Leu Arg Thr Arg Asp Leu Pro
1 5 10 15

Glu Leu Asn Ser Ile Leu
20 

(2) INFORMATION FOR SEQ ID NO: 156: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156: 
Pro Ser Arg Phe Ser Phe Gln Lys Val Glu Thr Leu Val Ile Thr Asn
1 5 10 15 

Cys Pro Arg Val Lys Lys Leu 
20 

(2) INFORMATION FOR SEQ ID NO: 157: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5134 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157: 



AAGCTTTACA GATTGGATGA TCTCTTAATG CATGCTGAAG TGACTGCAAA AAGGTTAGCA 60
ATATTCAGTG GTTCTCGTTA TGAATATTTC ATGAACGGAA GCAGCACTGA GAAAATGAGG 120
CCCTTGTTAT CTGATTTTCT GCAAGAGATT GAGTCTGTCA AGGTAGAGTT CAGAAATGTT 180
TGCTTGCAAG TTCTGGATAT ATCACCTTTT TCCCTGACAG ATGGAGAAGG CCTTGTTAAT 240
TTCTTATTAA AAAACCAGGC CAAGGTGCCG AATGATGATG CTGTTTCTTC TGATGGAAGT 300
TTAGAGGATG CAAGCAGCAC TGAGAAAATG GGACTTCCAT CTGATTTTCT CCGAGAGATT 360
GAGTCTGTTG AGATAAAGGA GGCCAGAAAA TTATATGATC AAGTTTTGGA TGCAACACAT 420
TGTGAGACGA GTAAGCACGA TGGAAAAAGC TTTATCAACA TTATGTTAAC CCAACAGGAC 480
AAGGTGCTGG ACTATGATGC TGGTTCAGTG TCTTATCTTC TTAACCAAAT CTCAGTAGTT 540
AAAGACAAAA TATTGCACAT TGGCTCTTTA CTTGTAGATA TTGTACAGTA CCGGAATATG 600
CATATAGAAC TTACAGATCT CGCTGAACGT GTTCAAGATA AAAACTACAT TCGTTTCTTC 660
TCTGTCAAGG GTTATATTCC TGCTTGGTAT TACACACTAT ATCTCTCTGA TGTCAAGCAA 720
TTGCTTAAGT TTGTTGAGGC AGAGGTAAAG ATTATTTGTC TGAAAGTACC AGATTCTTCA 780
AGTTATAGCT TCCCTAAGAC AAATGGATTA GGATATCTCA ATTGCTTTTT AGGCAAATTG 840
GAGGAGCTTT TACGTTCTAA GCTCGATTTG ATAATCGACT TAAAACATCA GATTGAATCA 900
GTCAAGGAGG GCTTATTGTG CCTAAGATCA TTCATTGATC ATTTTTCAGA AAGCTATGTT 960
GAGCATGATG AAGCTTGTGG TCTTATAGCA AGAGTTTCTG TAATGGCATA CAAGGCTGAG 1020
TATGTCATTG ACTCATGCTT GGCCTATTCT CATCCACTCT GGTACAAAGT TCTTTGGATT 1080
TCTGAAGTTC TTGAGAATAT TAAGCTTGTA AATAAAGTTG TTGGGGAGAC ATGTGAAAGA 1140
AGGAACACTG AAGTTACTGT GCATGAAGTT GCAAAGACTA CCACTAATGT AGCACCATCT 1200
TTTTCAGCTT ATACTCAAAG AGCAAACGAA GAAATGGAGG GTTTTCAGGA TACAATAGAT 1260
GAATTAAAGG ATAAACTACT TGGAGGATCA CCTGAGCTTG ATGTCATCTC AATCGTTGGC 1320
ATGCCAGGAT TGGGCAAGAC TACACTAGCA AAGAAGATTT ACAATGATCC AGAAGTCACC 1380
TCTCGCTTCG ATGTCCATGC TCAATGTGTT GTGACTCAAT TATATTCATG GAGAGAGTTG 1440
TTGCTCACCA TTTTGAATGA TGTGCTTGAG CCTTCTGATC GCAATGAAAA AGAAGATGGA 1500
GAAATAGCTG ATGATCTACG CCGATTTTTG TTGACCAAGA GATTCTTGAT TCTCATTGAT 1560
GATGTGTGGG ACTATAAAGT GTGGGACAAT CTATGTATGT GCTTCAGTGA TGTTTCAAAT 1620
AGGAGTAGAA TTATCCTAAC AACCCGCTTG AATGATGTCG CCGAATATGT CAAATGTGAA 1680
AGTGATCCCC ATCATCTTCG TTTATTCAGA GATGACGAGA GTTGGACATT ATTACAGAAA 1740
GAAGTCTTTC AAGGAGAGAG CTGTCCACCT GAACTTGAAG ATGTGGGATT TGAAATATCA 1800
AAAAGTTGTA GAGGGTTGCC TCTCTCAGTT GTGTTAGTAG CTGGTGTTCT GAAACAGAAA 1860
AAGAAGACAC TAGATTCATG GAAAGTAGTA GAACAAAGTC TAAGTTCCCA GAGGATTGGC 1920
AGCTTGGAAG AGAGCATATC TATAATTGGA TTCAGTTACA AGAATTTACC ACACTATCTT 1980
AAGCCTTGTT TTCTCTATTT TGGAGGATTT TTGCAGGGAA AGGATATTCA TGACTCAAAA 2040
ATGACCAAGT TGTGGGTAGC TGAAGAGTTT GTACAAGCAA ACAACGAAAA AGGACAAGAA 2100
GATACCCGCA CAAGGTTTCT TGGACGATCT TATTGGTAGG AATCTGGTGA TGGCCATGGA 2160
GAAGAGACCT AATGCCAAGG TGAAAACGTG CCGCATTCAT GATTTGTTGC ATAAATTCTG 2220
CATGGAAAAG GCCAAACAAG AGGATTTCCT TCTCCAGATC AATAGGTAAA AAAAACTGTA 2280
TTAATTTTAC ATTACAAAAA AAAAGAACTG TATTAATTTT ACTGTATTAT GTTTATGCCA 2340
ACTCTCATTT CCATGTGTTC TCTTTTATTC AATTCAGTGG AGAAGGTGTA TTTCCTGAAC 2400
GATTGGAAGA ATACCGATTG TTCGTTCATT CTTACCAAGA TGAAATTGAT CTGTGGCGCC 2460
CATCTCGCTC TAATGTCCGC TCTTTACTAT TCAATGCAAT TGATCCAGAT AACTTGTTAT 2520
GGCCGCGTGA TATCTCCTTC ATTTTTGAGA GCTTCAAGCT TGTTAAAGTG TTGGATTTGG 2580
AATCATTCAA CATTGGTGGT ACTTTTCCCA TTGAAACACA ATATCTAATT CAGATGAAGT 2640
ACTTTGCGGC CCAAACTGAT GCAAATTCAA TTCCTTCATC TATAGCTAAG CTTGAAAATC 2700
TTGAGACTTT TGTCGTAAGA GGATTGGGAG GAGAGATGAT ATTACCTTGT TCACTTCTGA 2760
[illegible] ATACATGTAA ATGATCGGGT TTCTTTTGGT TTGCGTGAGA 2820
ACATGGATGT TTTAACTGGT AACTCACAAT AACCTAATTT GGAAACCTTT TCTACTCCGC 2880
GTCTCTTTTA TGGTAAAGAC GCAGAGAAGA TTTTGAGGAA GATGCCAAAA TTGAGAAAAT 2940
TGAGTTGCAT ATTTTCAGGG ACATTTGGTT ATTCAAGGAA ATTGAAGGGT AGGTGTGTTC 3000
GTTTTCCCAG ATTAGATTTT CTAAGTCACC TTGAGTCCCT CAAGCTGGTT TCGAACAGCT 3060
ATCCAGCCAA ACTTCCTCAC AAGTTCAATT TCCCCTCGCA ACTAAGGGAA CTGACTTTAT 3120
CAAAGTTCCG TCTACCTTGG ACCCAAATTT CGATCATTGC AGAACTGCCC AACTTGGTGA 3180
TTCTTAAGTT ATTGCTCAGA GCCTTTGAAG GGGATCACTG GGAAGTGAAA GATTCAGAGT 3240
TCCTAGAACT CAAATACTTA AAACTGGACA ACCTCAAAGT TGTACAATGG TCCATCTCTG 3300
ATGATGCTTT TCCTAAGCTT GAACATTTGG TTTTAACGAA ATGTAAGCAT CTTGAGAAAA 3360
TCCCTTCTCG TTTTGAAGAT GCTGTTTGTC TAAATAGAGT TGAGGTGAAC TGGTGCAACT 3420
GGAATGTTGC CAATTCAGCC GAAGATATTC AAACTATGCA ACATGAAGTT ATAGCAAATG 3480
ATTCATTCAC AGTTACTATA CAGCCTCCAG ATTGGTCTAA AGAACAGCCC CTTGACTCTT 3540
AGCAAAGGTT TGTTCTTGCT GTGTTCATCC AAGTGCATTT AACATTTATT CATTTTGTTT 3600
TACACCAGAA CATGTTTATT TTGCTAGTAT TACTTGATAC ATTAAAAGAA ATCGAACTCA 3660
TATTTCTGCT ACAGTCTTAA CTTTTCTTGG GCTTACTTGA GGTCTAGATT AGATCAATGG 3720
TTCATGTAAT TTTTAATTCA CTGTTTCATT CAACTGTCTT ATGATAGTTG TGAAATGACA 3780
ATATTGTTAT CCCTAGCCAA ATTTATTATG TTCAAATGAA AACTGATGTC ACAACTACTT 3840
TTTTGTGAAA TGTTTTTGAA TTTTTTGCTA TAAAATTGAC GAATTGACAG CTTCTATATT 3900
TGTCAGCTAA ACTCTTTGTC ACCAGAAGTG TATTTAGAAT TACTGTGGTT TTATGAAAGA 3960
GTTCTGTAGA ATTTTATGCT TTTGCAGAAT ATAGTTTAAA ACAACAACAC TTCTCTGTTT 4020
CAGAGATAGC AGAAGCTAAA GTTCAAGGCA TTTTGTTTAT TTCTAGAACA AGTGGAGTTC 4080
TTATGTTGAA TTCTTGAAAA GAAGAAGAAT CAGGAGCAGG TAAAGTTATC TCTTTTTATG 4140
TTTTTCTTCT TTTAGATGTT ATTTCTTCAT CTTGAACGTG AACACCGCTG AAAGCATTTT 4200
AATAAAACCG GAGAGAAAAA TAAGATCTTT TTATATAAAG CATTATCATG TAAATATGCC 4260
TAAATCCATA TGGTACAACT GTTTGACAAA ATGATAGAGA GGGGAGTTTT ATAGTATAAG 4320
TAAAACAGGA TTGAGAAAAA AATCCTTGCA CGATTTTCAA TTTCTGGCCA CATCACAATG 4380
TGTGTCAAAG TTCCCCTCTT TAAGTGGAAC AAGCAATCAG AAAAGCTCAT TCTTATCGGT 4440
GACATACCAA TACCAGCTGA CTGTCTCATC TTGGTTAACT TAGCCTTGCT TACTTAGACT 4500
ATTAGATTAG TTACTAATGA ACTGGTAAAT TGGAACCAAA TGTAGTTAGC TTGATGAGCT 4560
GGTAGACATG TATATATGAA GATACACGCG TAACTTTAGT CGATGGTTAA TTTTTCATTT 4620
TTGATTTTTT TTCTTCACAG AGTATATATG AACTTGGCCT AAAAGTTTTG CTTCACTAAT 4680
TTAACTATTA CCGTGGATGA AACAAGCATG GCAACATTTT CAACAACTAT CACTCAAGCA 4740
ATGTAAAAAA TGGAGGTTCT ACGAGCGGTA CATGTAAGAG TTTTGTGCAC ACAAGAGGTT 4800
CTGAGACTTG AACCATCCAT GTCCAAGGCA GTTGAGATGC TAGTAAAGAA AGAAGAAGAT 4860
GAGCCTGCAC TAATTAATCT CCCTGTATGA ATGAGAGAAT GAGAAAAAGA TGGAGCTTCA 4920
TGAACCAAAA GTTACCTTTT TTTTTTCTTC TTAATGGCAT TACTTTGAAG CACATGTTTG 4980
TTAGTTGTAA ATTGTAATGG TGAAGTGTTT GTAAATATAG GGAGTGATAT TTGAAAGAAT 5040
GGTTGTGTTA TCTTTACAAA CCGGAATCAT TTCTGTATAA TTTTCTTCTG TAATTTTTGG 5100
TTTCGGTTTA TTCATTACTC ATTTCAGTAA GCTT 5134



(2) INFORMATION FOR SEQ ID NO: 158: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158: 
GGNATGGGNG GNNTNGGNAA RACNAC 
(2) INFORMATION FOR SEQ ID NO: 159: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159: 
NCGNGWNGTN AKDAWNCGNA 
(2) INFORMATION FOR SEQ ID NO: 160: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160: 
GGWNTBGGWA ARACHAC 

(2) INFORMATION FOR SEQ ID NO: 161: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161:
NRYNRDNGTN GTYTTNCCNA NNCCNNSNRK NCC 
(2) INFORMATION FOR SEQ ID NO: 162: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:162 
GGNMYNSSNG GNNTNGGNAA RACNAC 
(2) INFORMATION FOR SEQ ID NO: 163: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163 
TYGAYGAYRT BRA 

(2) INFORMATION FOR SEQ ID NO: 164: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164 
TYCCAVAYRT CRTCNA 

(2) INFORMATION FOR SEQ ID NO: 165: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165: 
VYMNAYRTCR TCNADNAVNA NNARNA 
(2) INFORMATION FOR SEQ ID NO: 166: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166: 
WWNMRRDTNY TNNTNBTNHT NNARNA 
(2) INFORMATION FOR SEQ ID NO: 167: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:167: 
NCGNGWNGTN AKDWNCGNGA 
(2) INFORMATION FOR SEQ ID NO: 168: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168:
NCKNSWNGTN ADDATDAATN G 
(2) INFORMATION FOR SEQ ID NO: 169:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169 
NARNGGNARN CC 

(2) INFORMATION FOR SEQ ID NO: 170:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170 
GGWYTBCCWY TBGCHYT 

(2) INFORMATION FOR SEQ ID NO: 171: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:171 
ARDGCVARWG GVARNCC 

(2) INFORMATION FOR SEQ ID NO: 172: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172:
NRNNWYNAVN SHNARNGGNA RNCC 
(2) INFORMATION FOR SEQ ID NO: 173: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173 
GGNYTNCCNY TNDSNBT 

(2) INFORMATION FOR SEQ ID NO: 174: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174: 
ARRTTRTCRT ADSWRAWYTT 
(2) INFORMATION FOR SEQ ID NO: 175: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175: 
ARNYYNTYRT ANSRNANNYY 
(2) INFORMATION FOR SEQ ID NO: 176: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:
RRNWTHWSNT AYRANRVNY 
(2) INFORMATION FOR SEQ ID NO: 177:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177 
GTNTTYYTNW SNTTYMGRGG 
(2) INFORMATION FOR SEQ ID NO: 178:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178 

CCNATHTTYT AYRWBGTNGA YCC 

(2) INFORMATION FOR SEQ ID NO: 179: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 17 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179 
GTNGGNATHG AYRMNCA 

(2) INFORMATION FOR SEQ ID NO: 180: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180 
RAARCANGCD ATRTCNARRA A 
(2) INFORMATION FOR SEQ ID NO: 181:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:
TTYYTNGAYA THGCNTGYTT 
(2) INFORMATION FOR SEQ ID NO: 182: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182 
CCCARTCYYK NADNWRRTCR TGCAT 
(2) INFORMATION FOR SEQ ID NO: 183:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183:
ATGCAYGAYY WNHTNMRRGA YATGGG

(2) INFORMATION FOR SEQ ID NO: 184: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184: 
NARNSWYTYN ARYTT 

(2) INFORMATION FOR SEQ ID NO: 185: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185: 
WSNAARYTNR ARWSNYT 

(2) INFORMATION FOR SEQ ID NO: 186: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186: 
DWWYTCNARN SWNYKNARNC C 
(2) INFORMATION FOR SEQ ID NO: 187: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187:
GGNYTNMRNW NNYTNGA
(2) INFORMATION FOR SEQ ID NO: 188:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188: 

Leu Lys Phe Ser Tyr Asp Asn Leu Glu Ser Asp Leu Leu 
1 5 10 
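The peptide entries in this listing use three-letter residue codes. For comparison with one-letter motif notation, a small conversion sketch is given below; the mapping is the standard amino acid code table, and the helper name `to_one_letter` is an assumption made for illustration.

```python
# Standard three-letter to one-letter amino acid codes; Xaa marks a variable position.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(three_letter_text):
    """Convert a whitespace-separated three-letter sequence to one-letter code.

    An unrecognized token raises KeyError, which helps flag transcription slips.
    """
    return "".join(THREE_TO_ONE[res] for res in three_letter_text.split())


# SEQ ID NO: 188 as printed above (declared LENGTH: 13 amino acids).
seq_188 = to_one_letter("Leu Lys Phe Ser Tyr Asp Asn Leu Glu Ser Asp Leu Leu")
print(seq_188, len(seq_188))
```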

(2) INFORMATION FOR SEQ ID NO: 189: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189: 

Gly Val Tyr Gly Pro Gly Gly Val Gly Lys Thr Thr Leu Met Gln Ser
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 190: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190: 
Gly Gly Leu Pro Leu Ala Leu Ile Thr Leu Gly Gly Ala Met

(2) INFORMATION FOR SEQ ID NO: 191: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 




(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 2

(D) OTHER INFORMATION: /note= "Xaa is Met or Pro"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 3

(D) OTHER INFORMATION: /note= "Xaa is Gly or Pro"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 5

(D) OTHER INFORMATION: /note= "Xaa is Ile, Leu or Val"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 10

(D) OTHER INFORMATION: /note= "Xaa is Ile, Leu or Thr"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 11

(D) OTHER INFORMATION: /note= "Xaa is Ala or Met"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191:

Gly Xaa Xaa Gly Xaa Gly Lys Thr Thr Xaa Xaa 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 192: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 1

(D) OTHER INFORMATION: /note= "Xaa is Phe or Lys"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 2

(D) OTHER INFORMATION: /note= "Xaa is Arg or Lys"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 3

(D) OTHER INFORMATION: /note= "Xaa is Ile, Val or Phe"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 5

(D) OTHER INFORMATION: /note= "Xaa is Ile, Leu or Val"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 6

(D) OTHER INFORMATION: /note= "Xaa is Ile or Leu"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 7

(D) OTHER INFORMATION: /note= "Xaa is Ile or Val"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 10

(D) OTHER INFORMATION: /note= "Xaa is Ile, Leu or Val"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 11

(D) OTHER INFORMATION: /note= "Xaa is Asp or Trp"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192:

Xaa Xaa Xaa Leu Xaa Xaa Xaa Asp Asp Xaa Xaa
1 5 10

(2) INFORMATION FOR SEQ ID NO: 193: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 1

(D) OTHER INFORMATION: /note= "Xaa is Ser or Cys"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 2

(D) OTHER INFORMATION: /note= "Xaa is Arg or Lys"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 3

(D) OTHER INFORMATION: /note= "Xaa is Phe, Ile or Val"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 4

(D) OTHER INFORMATION: /note= "Xaa is Ile or Met"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 5

(D) OTHER INFORMATION: /note= "Xaa is Ile, Leu or Phe"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 7

(D) OTHER INFORMATION: /note= "Xaa is Ser, Cys or Thr"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193:

Xaa Xaa Xaa Xaa Xaa Thr Xaa Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 194: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 5

(D) OTHER INFORMATION: /note= "Xaa is Thr, Ala or Ser"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 6

(D) OTHER INFORMATION: /note= "Xaa is Leu or Val"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 7

(D) OTHER INFORMATION: /note= "Xaa is Ile, Val or Lys"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 8

(D) OTHER INFORMATION: /note= "Xaa is Val or Thr"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194:

Gly Leu Pro Leu Xaa Xaa Xaa Xaa
1 5

(2) INFORMATION FOR SEQ ID NO: 195: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 1

(D) OTHER INFORMATION: /note= "Xaa is Lys or Gly"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 2

(D) OTHER INFORMATION: /note= "Xaa is Ile or Phe"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 5

(D) OTHER INFORMATION: /note= "Xaa is Asp or Lys"

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 6

(D) OTHER INFORMATION: /note= "Xaa is Ala, Gly or Asn"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195:

Xaa Xaa Ser Tyr Xaa Xaa Leu 
1 5 

(2) INFORMATION FOR SEQ ID NO: 196:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196: 

Asn Ser His Arg 
1 

(2) INFORMATION FOR SEQ ID NO: 197: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197: 

Arg Asp Arg Arg Arg Val Asp Pro Cys 
1 5 

(2) INFORMATION FOR SEQ ID NO: 198:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198: 

Thr Gly Asp Leu 
1 

(2) INFORMATION FOR SEQ ID NO: 199:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199: 

His Gly Thr Tyr 
1 

(2) INFORMATION FOR SEQ ID NO: 200: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200: 

Arg Met Ser His Gly Phe Arg Asn Ser Gln Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 201: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201:

Gly Glu Met Val Glu Ser Thr Gly Lys Arg Ser Thr Lys Arg Arg Ala
1 5 10 15

Leu Leu Phe Thr Ala Leu Cys Ser Lys Leu Ile
20 25




Claims 

1. A substantially pure oligonucleotide
comprising the sequence:

5' GGNATGGGNGGNNTNGGNAARACNAC 3', [SEQ ID NO: 158]
wherein N is A, T, G, or C; and R is A or G.

2. A substantially pure oligonucleotide
comprising the sequence:

5' NARNGGNARNCC 3', [SEQ ID NO: 169] wherein N is
A, T, G or C; and R is A or G.

3. A substantially pure oligonucleotide
comprising the sequence:

5' NCGNGWNGTNAKDAWNCGNGA 3', [SEQ ID NO: 159]
wherein N is A, T, G or C; W is A or T; D is A, G, or T;
and K is G or T.

4. A substantially pure oligonucleotide
comprising the sequence:

5' GGWNTBGGWAARACHAC 3', [SEQ ID NO: 160] wherein
N is A, T, G or C; R is G or A; B is C, G, or T; H is A,
C, or T; and W is A or T.

5. A substantially pure oligonucleotide
comprising the sequence:

5' TYGAYGAYRTBKRBRA 3', [SEQ ID NO: 163] wherein
R is G or A; B is C, G, or T; D is A, G, or T; Y is T or
C; and K is G or T.

6. A substantially pure oligonucleotide
comprising the sequence:

5' TYCCAVAYRTCRTCNA 3', [SEQ ID NO: 164] wherein N
is A, T, G or C; R is G or A; V is G or C or A; and Y is
T or C.

7. A substantially pure oligonucleotide
comprising the sequence:

5' GGWYTBCCWYTBGCHYT 3', [SEQ ID NO: 170] wherein
B is C, G, or T; H is A, C, or T; W is A or T; and Y is T
or C.

8. A substantially pure oligonucleotide
comprising the sequence:

5' ARDGCVARWGGVARNCC 3', [SEQ ID NO: 171] wherein
N is A, T, G or C; R is G or A; W is A or T; D is A, G,
or T; and V is G, C, or A.

9. A substantially pure oligonucleotide
comprising the sequence:

5' ARRTTRTCRTADSWRAWYTT 3', [SEQ ID NO: 174]
wherein R is G or A; W is A or T; D is A, G, or T; S is
G or C; and Y is C or T.
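Each oligonucleotide claimed above is a family of sequences defined by degenerate codes. As a rough illustration, the sketch below counts how many distinct sequences the claim 1 oligonucleotide covers and turns it into a regular expression for testing a concrete sequence. The `candidate` string is a hypothetical example constructed for this sketch, and the code table also includes standard IUPAC symbols (such as M) that the claims do not define.

```python
import re
from math import prod

# Standard IUPAC degenerate nucleotide codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(oligo):
    """Number of distinct concrete sequences covered by a degenerate oligo."""
    return prod(len(IUPAC[base]) for base in oligo)


def to_regex(oligo):
    """Translate a degenerate oligo into a regular-expression pattern."""
    return "".join(
        base if len(IUPAC[base]) == 1 else f"[{IUPAC[base]}]" for base in oligo
    )


seq_158 = "GGNATGGGNGGNNTNGGNAARACNAC"  # claim 1, SEQ ID NO: 158
print(degeneracy(seq_158))  # seven N positions and one R position

# Hypothetical concrete sequence, made up for this illustration only.
candidate = "GGAATGGGTGGTGTTGGAAAAACTAC"
print(bool(re.fullmatch(to_regex(seq_158), candidate)))
```

Exact pattern matching of this kind is stricter than primer annealing in practice, which tolerates some mismatches depending on reaction conditions.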

10. A recombinant plant gene comprising the DNA
sequence:

5' GGNATGGGNGGNNTNGGNAARACNAC 3', [SEQ ID NO: 158]
wherein N is A, T, G or C; and R is A or G.

11. The gene of claim 10, further comprising the
sequence:

5' NARNGGNARNCC 3', [SEQ ID NO: 169] wherein N is
A, T, G or C; and R is A or G.

12. The gene of claim 11, further comprising the
sequence:

5' NCGNGWNGTNAKDAWNCGNGA 3', [SEQ ID NO: 167]
wherein N is A, T, G or C; W is A or T; D is A, G or T;
and K is G or T.

13. A recombinant plant gene comprising a
combination of any two or more sequences of claims 10, 11
and 12.

14. A substantially pure plant polypeptide
comprising the amino acid sequence:

Gly Xaa1 Xaa2 Gly Xaa3 Gly Lys Thr Thr Xaa4 Xaa5,
[SEQ ID NO: 191], wherein Xaa1 is Met or Pro; Xaa2 is Gly
or Pro; Xaa3 is Ile, Leu, or Val; Xaa4 is Ile, Leu, or
Thr; and Xaa5 is Ala or Met.

15. A substantially pure plant polypeptide
comprising the amino acid sequence:

Xaa1 Xaa2 Xaa3 Leu Xaa4 Xaa5 Xaa6 Asp Asp Xaa7
Xaa8, [SEQ ID NO: 192],
wherein Xaa1 is Phe or Lys; Xaa2 is Arg or Lys; Xaa3 is
Ile, Val, or Phe; Xaa4 is Ile, Leu, or Val; Xaa5 is Ile or
Leu; Xaa6 is Ile or Val; Xaa7 is Ile, Leu, or Val; and
Xaa8 is Asp or Trp.

16. A substantially pure plant polypeptide
comprising the amino acid sequence:

Xaa1 Xaa2 Xaa3 Xaa4 Xaa5 Thr Xaa6 Arg, [SEQ ID NO: 193],
wherein Xaa1 is Ser or Cys; Xaa2 is Arg or Lys; Xaa3 is
Phe, Ile, or Val; Xaa4 is Ile or Met; Xaa5 is Ile, Leu,
or Phe; and Xaa6 is Ser, Cys, or Thr.

17. A substantially pure plant polypeptide
comprising the amino acid sequence:

Gly Leu Pro Leu Xaa1 Xaa2 Xaa3 Xaa4, [SEQ ID NO: 194],
wherein Xaa1 is Thr, Ala, or Ser; Xaa2 is Leu or Val; Xaa3
is Ile, Val, or Lys; and Xaa4 is Val or Thr.

18. A substantially pure plant polypeptide
comprising the amino acid sequence:

Xaa1 Xaa2 Ser Tyr Xaa3 Xaa4 Leu, [SEQ ID NO: 195],
wherein Xaa1 is Lys or Gly; Xaa2 is Ile or Phe; Xaa3 is
Asp or Lys; and Xaa4 is Ala, Gly, or Asn.
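The consensus peptides of claims 14 to 18 list the residues allowed at each variable position, which maps naturally onto a character-class pattern. The following sketch writes the claim 14 motif (SEQ ID NO: 191) as a regular expression and scans the peptide of SEQ ID NO: 189 from the sequence listing, converted to one-letter code. The helper name `motif_to_regex` is an assumption made for this illustration.

```python
import re


def motif_to_regex(positions):
    """Build a regex from a list of allowed-residue strings, one per position."""
    return "".join(p if len(p) == 1 else f"[{p}]" for p in positions)


# Claim 14 / SEQ ID NO: 191: Gly Xaa1 Xaa2 Gly Xaa3 Gly Lys Thr Thr Xaa4 Xaa5
motif_191 = motif_to_regex(
    ["G", "MP", "GP", "G", "ILV", "G", "K", "T", "T", "ILT", "AM"]
)

# SEQ ID NO: 189 in one-letter code:
# Gly Val Tyr Gly Pro Gly Gly Val Gly Lys Thr Thr Leu Met Gln Ser
seq_189 = "GVYGPGGVGKTTLMQS"

hit = re.search(motif_191, seq_189)
if hit:
    # Report the matched segment and its 1-based start position.
    print(hit.group(), hit.start() + 1)
```

The same helper can be applied to the other consensus peptides by listing their allowed residues position by position.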



19. A method of isolating a disease-resistance
gene or fragment thereof from a plant cell, comprising:

(a) providing a sample of plant cell DNA;

(b) providing a pair of oligonucleotides
having sequence homology to a conserved region of an RPS
disease-resistance gene;

(c) combining said pair of oligonucleotides
with said plant cell DNA sample under conditions suitable
for polymerase chain reaction-mediated DNA amplification;
and

(d) isolating said amplified disease-
resistance gene or fragment thereof.

20. The method of claim 19, wherein said
amplification is carried out using a reverse-
transcription polymerase chain reaction.

21. The method of claim 19, wherein said reverse-
transcription polymerase chain reaction is RACE.
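Steps (b) to (d) of claim 19 can be mimicked in software as a simple "in silico PCR": locate the forward primer on the template, locate the site complementary to the reverse primer downstream of it, and return the segment in between. The sketch below assumes exact matching to the degenerate codes, pairs the claim 4 and claim 8 oligonucleotides purely for illustration, and uses a made-up `template`; none of these choices is taken from the application.

```python
import re

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def to_regex(oligo):
    """Translate a degenerate oligo into a regular-expression pattern."""
    return "".join(b if len(IUPAC[b]) == 1 else f"[{IUPAC[b]}]" for b in oligo)


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def amplify(template, forward, reverse):
    """Return the template segment bounded by the two primer sites, or None."""
    fwd_hit = re.search(to_regex(forward), template)
    if fwd_hit is None:
        return None
    # The reverse primer anneals to the opposite strand, so on the template
    # strand its site reads as the primer's reverse complement.
    rev_site = re.compile(to_regex(reverse_complement(reverse)))
    rev_hit = rev_site.search(template, fwd_hit.end())
    if rev_hit is None:
        return None
    return template[fwd_hit.start():rev_hit.end()]


forward = "GGWNTBGGWAARACHAC"  # claim 4, SEQ ID NO: 160
reverse = "ARDGCVARWGGVARNCC"  # claim 8, SEQ ID NO: 171

# Hypothetical template constructed for this example only.
template = "ACGT" + "GGAGTTGGAAAAACAAC" + "CATCATCATCAT" + "GGTCTTCCACTTGCTCT" + "TTGA"

amplicon = amplify(template, forward, reverse)
print(amplicon, len(amplicon))
```

Real amplification depends on annealing temperature, mismatch tolerance and product length, none of which this string-matching sketch models.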

22. A method of identifying a plant disease-
resistance gene in a plant cell, comprising:

(a) providing a preparation of plant cell DNA;

(b) providing a detectably-labelled DNA sequence
having homology to a conserved region of an RPS gene;

(c) contacting said preparation of plant cell DNA
with said detectably-labelled DNA sequence under
hybridization conditions providing detection of genes
having 50% or greater sequence identity; and

(d) identifying a disease-resistance gene by its
association with said detectable label.

23. The method of claim 22, wherein said DNA
sequence is produced according to the method of claim 19.

24. The method of claim 22, wherein said
preparation of plant cell DNA is isolated from a plant
genome.

25. A method of isolating a disease-resistance
gene from a recombinant plant cell library, comprising:

(a) providing a recombinant plant cell library;

(b) contacting said recombinant plant cell library
with a detectably-labelled gene fragment produced
according to the method of claim 19 under hybridization
conditions providing detection of genes having 50% or
greater sequence identity; and

(c) isolating a member of a disease-resistance
gene by its association with said detectable label.

26. A method of isolating a disease-resistance
gene from a recombinant plant cell library, comprising:

(a) providing a recombinant plant cell library;

(b) contacting said recombinant plant cell library
with a detectably-labelled oligonucleotide of any of
claims 1-9 under hybridization conditions providing
detection of genes having 50% or greater sequence
identity; and

(c) isolating a disease-resistance gene by its
association with said detectable label.
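The 50% sequence identity threshold recited in claims 22, 25 and 26 describes what the hybridization conditions should detect; the conditions themselves are set experimentally. For orientation only, the sketch below computes percent identity between two sequences that are assumed to be already aligned to equal length, with "-" marking gaps. Both example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned, non-gap columns of two equal-length strings."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where neither sequence has a gap.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no aligned positions to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Hypothetical probe and target, invented for this example.
probe = "ATGCATGCAT"
target = "ATGGATCCAA"

identity = percent_identity(probe, target)
print(identity, identity >= 50.0)
```

Other conventions exist for the denominator (for example, counting gap columns), so the figure obtained depends on the definition chosen.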




27. A recombinant plant polypeptide capable of
conferring disease-resistance wherein said plant
polypeptide comprises a P-loop domain or nucleotide
binding site domain.

28. The recombinant plant polypeptide of claim
27, wherein said polypeptide further comprises a leucine-
rich repeating domain.

29. A recombinant plant polypeptide capable of
conferring disease-resistance wherein said plant
polypeptide contains a leucine-rich repeating domain.



30. A plant disease-resistance gene isolated
according to the method comprising:

(a) providing a sample of plant cell DNA;

(b) providing a pair of oligonucleotides having
sequence homology to a conserved region of an RPS
disease-resistance gene;

(c) combining said pair of oligonucleotides with
said plant cell DNA sample under conditions suitable for
polymerase chain reaction-mediated DNA amplification; and

(d) isolating said amplified disease-resistance
gene or fragment thereof.

31. A plant disease-resistance gene isolated
according to the method comprising:

(a) providing a preparation of plant cell DNA;

(b) providing a detectably-labelled DNA sequence
having homology to a conserved region of an RPS gene;

(c) contacting said preparation of plant cell DNA
with said detectably-labelled DNA sequence under
hybridization conditions providing detection of genes
having 50% or greater sequence identity; and

(d) identifying a disease-resistance gene by its
association with said detectable label.

32. A plant disease-resistance gene according to
the method comprising:

(a) providing a recombinant plant cell library;

(b) contacting said recombinant plant cell library
with a detectably-labelled gene fragment produced
according to the method of claims 1-4 under hybridization
conditions providing detection of genes having 50% or
greater sequence identity; and

(c) isolating a disease-resistance gene by its
association with said detectable label.

33. A method of identifying a plant disease-
resistance gene comprising:

(a) providing a plant tissue sample;

(b) introducing by biolistic transformation into
said plant tissue sample a candidate plant disease-
resistance gene;

(c) expressing said candidate plant disease-
resistance gene within said plant tissue sample; and

(d) determining whether said plant tissue sample
exhibits a disease-resistance response, whereby a
response identifies a plant disease-resistance gene.

34. The method of claim 33, wherein said plant
tissue sample comprises leaf, root, flower, fruit, or
stem tissue.

35. The method of claim 33, wherein said
candidate plant disease-resistance gene is obtained from
a cDNA expression library.

36. The method of claim 33, wherein said disease-
resistance response is the hypersensitive response.

37. A plant disease-resistance gene isolated
according to the method comprising:

(a) providing a plant tissue sample;

(b) introducing by biolistic transformation into
said plant tissue sample a candidate plant disease-
resistance gene;

(c) expressing said candidate plant disease-
resistance gene within said plant tissue sample; and

(d) determining whether said plant tissue sample
exhibits a disease-resistance response, whereby a
response identifies a plant disease-resistance gene.

38. A purified antibody which binds specifically
to an rps family protein.

39. A DNA sequence substantially identical to the
DNA sequence shown in Figure 12.

40. A substantially pure polypeptide having a 
sequence substantially identical to a Prf amino acid 

sequence shown in Figure 5 (A or B).

[FIG. 1 (continued): double-stranded nucleotide sequence with its translation in three reading frames (a, b, c), numbered from position 601 to about 2900 and followed by a restriction-enzyme note ("Enzymes that do cut: NONE"). The sequence text on these drawing sheets, which include sheets 5/25 and 7/25, did not survive scanning.]
[Figure 3 and Figure 3 continued: nucleotide sequence with the encoded amino acid sequence printed beneath it in three-letter code. The sequence text did not survive scanning.]
FIGURE 5A
[Sequence panel not recoverable from the scan; only position labels (101, 200) survive.]

RPS2 Transient Expression Assay
Principle of the assay
[Schematic not recoverable from the scan.]
[Figure: nucleotide sequence printed in rows of 60 bases in blocks of ten, with row numbering starting at 1 and running to at least 3781. The sequence text did not survive scanning.]
taeaatagat 1260 
aateCttggc 1320 
agaagteaee 1380 
gagagagttg 1440 
agaagatgga 153C 
teteattg&t 1560 
tgttteaaat 1620 
eaaatgtgaa 1680 
ateaea 9 aaa 1740 
tgaaatatea 1809 
gaaacagaaa I960 
gaggattege 1920 
aeaetatett 1980 
tgAetcaaaa 2040 
aggacaagaa 2100 
tg 9 ceat 9 g« 2160 
ataaattct 9 2220 
aaaaaeegta 2280 
grttatgeea 2340 
tttcetgaac 2400 
ctgtggcgec 2460 
aacttgttat 2520 
ttggacttgg 2580 
cagatgaagt 2640 
ettgaaaate 2 "J CO 
tcaettctga 27 60 
ttgcgtgaga 2820 
tetaeteege 2880 
ttgagaaaat 2940 
aggcgtgt" c 3000 
tegaaeaget 3C60 
ctgaetttat 3120 
aaettggtga 3180 
gatteagagt 321 a 
cccatetetg 330C 
ett 9 agaaaa 3360 
cggtgcaaet 3420 
atagcaaatg 3480 
ett 9 actctt 33 40 
cattttgttt 3 500 
Ategaactc* 3660 
agatcaaegg 3720 
tgaaatgaea 37 ao 
acaaetaett 3940 
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3841 ttttgtgaaa 
3801 tgtcagetaa 
3961 gttetgtaga 
4021 cagagacage 
4081 ttacgttgaa 
4111 tttttectet 
4201 aacaaaaccg 
4281 taaateeata 
4321 taaaacagga 
4381 tgtgteaaag 
4441 9&cacaeeaa 
4301 attagattag 
4361 ggtagaeacg 
4621 tcgatttttt 
4 681 ttaaetarca 
4141 atgtaaaaaa 
4B01 CCgagaettg 
4861 gagcctgcac 
4921 tgaaeeaaaa 
4»8l ttagttgtaa 
50 <1 99Ctgtgtta 
3101 ttceggttea 
I 10 



tgtrtt* 

actctttgwc 
tttctatget 
*9aagctaaa 
CCettgaaa* 
ettagatgx* 
9agagaa*aa 
tggtaeaaec 
ttgagaaaaa 
ttcccctcct 
taceageega 
ttaetaacga 
tatatatgaa 
ttccccacag 
eegcggacga 
fcfgaggttct 
aaeeaeeeat 
taacraaret 
Tctaectctc 
*«gtaatgg 
cccttaeaaa 
rteattaetc 
I 20 



tr**. .get* 
aeeagaagtg 
tttgeagaat 
9tteaaggea 
gmagaagaac 
at erect eac 
taagatette 
gttrgaeaca 
aateettgea 
?a*gtggaae 
ctgtctcate 
actggtaaat 
gataeaege? 
agcatatatg 
aaeaageatg 
tegageggta 
gteeaaggea 
eestgtatga 
tttcttcttc 
tgaagtgttt 
eeggaateat 
atttcagtaa 
I 30 



caaaattgae 
-arttagaac 
atagtttaaa 
tttegxttae 
caggageagg 
ettgaaegtg 
ttatAtaaag 
atgatavaga 
cgatttteaa 
aageaatcag 
Ctggttaaet 

•99*accaaa 
taaecctagt 
aeetggeeta 
gcaacatttt 
catgtaagag 
gttgagatge 
«9*9*9aa* 
ttaafcggcat 
9***atatag 
ttesgtacaa 
gets 

l 40 



gaartgaeag 
raetgtgget 
aeaaeaaeas 
ttcragaaea 
taaagrcate 
aaeaecgetg 
eattateatg 
cgggagtttt 
tttctggcca 
aaaageTeat 
tagcettget 
tgcagttage 
egatggecaa 
aaagxttttg 
eaacaactat 
ttttgtgcac 
tagtaa*gaa 
gagaaaaaga 
taecttgaag 
ggagegatat 
Ctttcttctg 

t 39 



etc CX • 
ctatgaaaga 3960 
rtctcrgttt «Q20 
agtggagcte 40so 
tctttttatg 4 HO 
aaageatctt 4200 
taaatatgee 4 2 CO 
atagtataag 4320 
eateacaatg 4380 
teetateggt 44 40 
tacttagact 4300 
ctgaxgagec 4360 
Cttttcattt 4620 
etteaetaat 4680 
eaeteaagea 4740 
aeaagaggct 4800 
agaavaagat 4860 
tggagettea 4820 
eaeatgtttg 4960 
ttgaaagaac SO 40 
caacttttgg 5100 
3134 

I 60 
1. Claims Nos.:

because they relate to subject matter not required to be searched by this Authority, namely: 



2. Claims Nos.:

because they relate to parts of the international application that do not comply with the prescribed requirements to such
an extent that no meaningful international search can be carried out, specifically: 



3. Claims Nos.:

because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).
This International Searching Authority found multiple inventions in this international application, as follows:
Please See Extra Sheet. 
This application contains the following inventions or groups of inventions which are not so linked as to form a single
inventive concept under PCT Rule 13.1. In order for all inventions to be examined, the appropriate additional
examination fees must be paid. 

Group I, claims 1-32, drawn to RPS oligos and polypeptides. 

This application contains claims directed to more than one species of the generic invention. These species are deemed 
to lack Unity of Invention because they are not so linked as to form a single inventive concept under PCT Rule 13.1. 
In order for more than one species to be examined, the appropriate additional examination fees must be paid. The 
species are as follows: 

1. The oligos of claim 1.

2. The oligos of claim 2. 

3. The oligos of claim 3. 

4. The oligos of claim 4.

5. The oligos of claim 5. 

6. The oligos of claim 6. 

7. The oligos of claim 7. 

8. The oligos of claim 8. 

9. The oligos of claim 9. 
and 

1. The peptides of claim 14. 

2. The peptides of claim 15. 

3. The peptides of claim 16.

4. The peptides of claim 17. 

5. The peptides of claim 18. 

The species listed above do not relate to a single inventive concept under PCT Rule 13.1 because, under PCT Rule 
13.2, the species lack the same or corresponding special technical features for the following reasons: each group of 
oligos comprises a separate and distinct chemical entity which does not share a special technical feature with any other
species. Likewise, each group of peptides comprises a separate and distinct chemical entity which does not share a
special technical feature with any other species of peptide. 

Group II, claims 33-37, drawn to identification of plant disease resistance using biolistics. 

Group III, claim 38, drawn to an antibody. 

Group IV, claims 39-40, drawn to the Prf amino acid sequence.

The inventions listed as Groups I and II do not relate to a single inventive concept under PCT Rule 13.1 because, under
PCT Rule 13.2, they lack the same or corresponding special technical features for the following reasons: Group I is
drawn to specific products and the use of those products to isolate disease resistance genes while Group II is not drawn
to the use of any specific product to isolate genes but rather to the use of biolistics to isolate genes. The use of
biolistics is a different inventive concept than the use of the specific products of Group I. Therefore, Groups I and II
do not share a special technical feature.

The inventions listed as Groups I and III do not relate to a single inventive concept under PCT Rule 13.1 because, 
under PCT Rule 13.2, they lack the same or corresponding special technical features for the following reasons: Group 
III is drawn to an antibody which is a product that is chemically distinct from the products of Group I which are 
specific oligos and peptides. Therefore, Groups I and III do not share a special technical feature. 

The inventions listed as Groups I and IV do not relate to a single inventive concept under PCT Rule 13.1 because, 
under PCT Rule 13.2, they lack the same or corresponding special technical features for the following reasons: Group 
I is drawn to specific nucleotides and peptides related to RPS while Group IV relates to a Prf amino acid sequence.
These appear to be separate and distinct chemical entities. Therefore, Groups I and IV do not share a special technical 
feature. 
The inventions listed as Groups II and III do not relate to a single inventive concept under PCT Rule 13.1 because,
under PCT Rule 13.2, they lack the same or corresponding special technical features for the following reasons: the
antibodies of Group III could not be used in the method of Group II. Therefore, Groups II and III do not share a
special technical feature. 



The inventions listed as Groups II and IV do not relate to a single inventive concept under PCT Rule 13.1 because,
under PCT Rule 13.2, they lack the same or corresponding special technical features for the following reasons: the Prf 
amino acid sequence of Group IV could not be used in the method of Group II. Therefore, Groups II and IV do not 
share a special technical feature. 

The inventions listed as Groups III and IV do not relate to a single inventive concept under PCT Rule 13.1 because,
under PCT Rule 13.2, they lack the same or corresponding special technical features for the following reasons: the
antibody of Group III and the Prf amino acid sequence of Group IV appear to be separate, distinct and unrelated
chemical entities. Therefore, Groups III and IV do not share a special technical feature. 

Accordingly, the claims are not so linked by a special technical feature within the meaning of PCT Rule 13.2 so as to 
form a single inventive concept. 